CONSULTING 
PSYCHOLOGY 








AMERICAN PSYCHOLOGICAL ASSOCIATION





April, 1957 Vol. 21, No. 2 
Contents 


The Necessary and Sufficient Conditions of Therapeutic Personality Change: Carl R. Rogers - 95

Predictive Empathy and The Study of Values: Howard M. Halpern - 104

The Validity of Judgments Based on “Blind” Rorschach Records: Guinevere S. Chambers
and Roy M. Hamlin - 105

A Comparison of Client and Therapist Ratings on Two [illegible] Variables:
Malcolm H. Robertson

Studies in Fantasy: Daydreaming Frequency and Rorschach Scoring Categories: Horace
A. Page

Levels of Prediction from the TAT: Seymour Fisher and Robert B. Morton - 115

Validities of Abbreviated WAIS Scales: Eileen Maxwell - 121

The Effect of Distrust on Some Aspects of Intelligence Test Behavior: Gerald Wiener - 127

Reliability (Internal Consistency) of the Wechsler Memory Scale and Correlation with the
Wechsler-Bellevue Intelligence Scale: Julia C. Hall

Subtest Disparity of Negro and White Groups Matched for IQs on the Revised Beta Test:
Walter A. Woods and Robert Toal

Revised Administration and Scoring of the Digit Span Test: Harold L. Blackburn and
Arthur L. Benton

The Relationship of the WISC and Stanford-Binet to School Achievement: Ernest S. Barratt
and Doris L. Baumgarten

Self-Acceptance and Psychopathology: Marvin Zuckerman and Irwin Monashkin

Disorientation as a Prognostic Criterion: A. Eskey and Gladys Miller Friedman

On the Relation Between A-Scale Scores and Digit Symbol Performance: Leonard D. Goodstein
and I. E. Farber

Some Social and Cultural Factors Determining Relations Between Authoritarianism and
Measures of Neuroticism: Anthony Davids and Charles W. Eriksen

Spoken and Written Vocabulary; Their Relation to a Standard Vocabulary Test, Intelligence,
and Anxiety: Maurice W. Sullivan and Allen D. Calvin

A Cross-Cultural Comparison of the MMPI: Ronald Taft

“Response to the Human Face as a Standard Stimulus”: A Re-examination: Ernest G. Beier,
Carroll E. Izard, Charles D. Smock, and Roland R. Tougas

Adjustment Testing and Personality Factors of the Blind: Sidney I. Dean

Childrearing Attitudes of Emotionally Disturbed Adolescents: George Spivack

Age, Vocabulary, Anxiety, and Brain Damage as Factors in Verbal [illegible]:
[illegible] Robertson

The Stability of the Social Desirability Scale Values in the Edwards Personal Preference
Schedule: C. James Klett

Overinclusive Thinking in a Depressive and a Control Group: R. W. Payne and Heather L.
Hirst - 186











Brief Reports


The Journal of Consulting Psychology will
accept Brief Reports of research studies in
clinical psychology for early publication without
expense to the author. The procedure is
intended to permit the publication of soundly
designed studies of specialized interest or limited
importance which cannot now be accepted
because of lack of space. Several pages
in each issue will be devoted to Brief Reports,
published in the order of their receipt without
respect to the dates of receipt of the regular
articles. Most Brief Reports appear in the
first or second issue to go to press following
their final acceptance.


An author who wishes to submit a Brief
Report:


1. Sends the Brief Report, limited to one printed
page and prepared according to the specifications
given below.

2. Also sends to the Editor a full report of the research
study, in sufficient detail to give a clear account
of its background, procedure, results, and conclusions,
which will be filed with the American
Documentation Institute to insure indefinite availability.

3. Prepares at least 100 mimeographed copies of
the full report, which the author will send without
charge to all who request it as long as the supply
lasts.

4. Agrees not to submit the full report to another
journal of general circulation.


Specifications

Brief Report. The Brief Report should give
a clear, condensed summary of the procedure
of the study and as full an account of the results
as space permits.

To insure that the Brief Report will be no
longer than one printed page, its typescript,
including all matter except the title and the
author’s lines, must not exceed 75 lines averaging
42 characters and spaces in length.
Set the typewriter margins for short lines of
42 characters, which are 3.5 inches long in
elite typing, and 4.2 inches long in pica.
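
The two line lengths follow from ordinary typewriter pitch. As a quick check, assuming elite type at 12 characters per inch and pica type at 10 characters per inch (the pitches themselves are not stated in this notice):

```latex
\[
\frac{42\ \text{characters}}{12\ \text{characters per inch}} = 3.5\ \text{inches (elite)},
\qquad
\frac{42\ \text{characters}}{10\ \text{characters per inch}} = 4.2\ \text{inches (pica)}
\]
\[
75\ \text{lines} \times 42\ \text{characters per line} = 3150\ \text{characters and spaces at most}
\]
```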

The manuscript of the Brief Report must
be double spaced throughout. Except for its
short lines, it follows the standard style (1).
Headings, tables, and references are avoided
or, if essential, must be counted in the 75
lines. Each Brief Report must be accompanied
by a footnote in the style below,
which is typed on a separate sheet and not
counted in the 75-line quota: 1


1 An extended report of this study may be obtained
without charge from John Doe, 300 Market
St., Prospect 6, Mass. (giving the author’s full name
and address), or for a fee from the American Documentation
Institute. Order Document No. ____, remitting
$____ for microfilm or $____ for photocopies.





Extended report. The full report is pre- 
pared in the style specified by the Publica- 
tion Manual (1), except that it may be typed 
with single spacing for economy in photo- 
duplication by the ADI. 


Reference 


1. American Psychological Association. Council of 
Editors. Publication manual of the American 
Psychological Association (1957 rev.). Wash- 
ington, D. C.: American Psychological Asso- 
ciation, 1957. 



















Journal of Consulting Psychology 
Vol. 21, No. 2, 1957 


The Necessary and Sufficient Conditions of 
Therapeutic Personality Change 


Carl R. Rogers 


University of Chicago 


For many years I have been engaged in 
psychotherapy with individuals in distress. 
In recent years I have found myself increas- 
ingly concerned with the process of abstract- 
ing from that experience the general prin- 
ciples which appear to be involved in it. I 
have endeavored to discover any orderliness, 
any unity which seems to inhere in the subtle, 
complex tissue of interpersonal relationship in 
which I have so constantly been immersed in 
therapeutic work. One of the current prod- 
ucts of this concern is an attempt to state, in 
formal terms, a theory of psychotherapy, of 
personality, and of interpersonal relationships 
which will encompass and contain the phenomena
of my experience.1 What I wish to do
in this paper is to take one very small seg- 
ment of that theory, spell it out more com- 
pletely, and explore its meaning and useful- 
ness. 


The Problem 


The question to which I wish to address 
myself is this: Is it possible to state, in terms 
which are clearly definable and measurable, 
the psychological conditions which are both 
necessary and sufficient to bring about constructive
personality change? Do we, in other words, know
with any precision those elements which are
essential if psychotherapeutic change is to ensue?

Before proceeding to the major task let me
dispose very briefly of the second portion of
the question. What is meant by such phrases
as “psychotherapeutic change,” “constructive
personality change”? This problem also deserves
deep and serious consideration, but for
the moment let me suggest a common-sense
type of meaning upon which we can perhaps
agree for purposes of this paper. By these
phrases is meant: change in the personality
structure of the individual, at both surface
and deeper levels, in a direction which clinicians
would agree means greater integration,
less internal conflict, more energy utilizable
for effective living; change in behavior away
from behaviors generally regarded as immature
and toward behaviors regarded as mature.
This brief description may suffice to indicate
the kind of change for which we are
considering the preconditions. It may also
suggest the ways in which this criterion of
change may be determined.2


1 This formal statement is entitled “A theory of
therapy, personality and interpersonal relationships,
as developed in the client-centered framework,” by
Carl R. Rogers. The manuscript was prepared at the
request of the Committee of the American Psychological
Association for the Study of the Status and
Development of Psychology in the United States. It
will be published by McGraw-Hill in one of several
volumes being prepared by this committee. Copies of
the unpublished manuscript are available from the
author to those with special interest in this field.


The Conditions 


As I have considered my own clinical ex- 
perience and that of my colleagues, together 
with the pertinent research which is avail- 
able, I have drawn out several conditions 
which seem to me to be necessary to initiate
constructive personality change, and which, 
taken together, appear to be sufficient to in- 
augurate that process. As I have worked on 
this problem I have found myself surprised 
at the simplicity of what has emerged. The 


2 That this is a measurable and determinable criterion
has been shown in research already completed.
See (7), especially chapters 8, 13, and 17.







statement which follows is not offered with 
any assurance as to its correctness, but with 
the expectation that it will have the value of
any theory, namely that it states or implies
a series of hypotheses which are open to proof
or disproof, thereby clarifying and extending 
our knowledge of the field. 

Since I am not, in this paper, trying to 
achieve suspense, I will state at once, in se- 
verely rigorous and summarized terms, the 
six conditions which I have come to feel are 
basic to the process of personality change. 
The meaning of a number of the terms is not 
immediately evident, but will be clarified in 
the explanatory sections which follow. It is 
hoped that this brief statement will have 
much more significance to the reader when 
he has completed the paper. Without further 
introduction let me state the basic theoreti- 
cal position. 

For constructive personality change to oc- 
cur, it is necessary that these conditions exist 
and continue over a period of time: 


1. Two persons are in psychological con- 
tact. 

2. The first, whom we shall term the client, 
is in a state of incongruence, being vulnerable 
or anxious. 

3. The second person, whom we shall term 
the therapist, is congruent or integrated in 
the relationship. 

4. The therapist experiences unconditional 
positive regard for the client. 

5. The therapist experiences an empathic 
understanding of the client’s internal frame of 
reference and endeavors to communicate this 
experience to the client. 

6. The communication to the client of the 
therapist’s empathic understanding and un- 
conditional positive regard is to a minimal 
degree achieved. 


No other conditions are necessary. If these 
six conditions exist, and continue over a pe- 
riod of time, this is sufficient. The process of 
constructive personality change will follow. 


A Relationship 


The first condition specifies that a minimal 
relationship, a psychological contact, must 
exist. I am hypothesizing that significant 
positive personality change does not occur
except in a relationship. This is of course an
hypothesis, and it may be disproved. 

Conditions 2 through 6 define the charac- 
teristics of the relationship which are re- 
garded as essential by defining the necessary 
characteristics of each person in the relation- 
ship. All that is intended by this first condi- 
tion is to specify that the two people are to 
some degree in contact, that each makes some 
perceived difference in the experiential field 
of the other. Probably it is sufficient if 
each makes some “subceived” difference, even 
though the individual may not be consciously 
aware of this impact. Thus it might be diffi- 
cult to know whether a catatonic patient per- 
ceives a therapist’s presence as making a dif- 
ference to him—a difference of any kind— 
but it is almost certain that at some organic 
level he does sense this difference. 

Except in such a difficult borderline situa- 
tion as that just mentioned, it would be rela- 
tively easy to define this condition in op- 
erational terms and thus determine, from a 
hard-boiled research point of view, whether 
the condition does, or does not, exist. The 
simplest method of determination involves 
simply the awareness of both client and 
therapist. If each is aware of being in per- 
sonal or psychological contact with the other, 
then this condition is met.

This first condition of therapeutic change 
is such a simple one that perhaps it should 
be labeled an assumption or a precondition 
in order to set it apart from those that fol- 
low. Without it, however, the remaining items 
would have no meaning, and that is the rea- 
son for including it. 


The State of the Client 


It was specified that it is necessary that 
the client be “in a state of incongruence,
being vulnerable or anxious.” What is the
meaning of these terms?

Incongruence is a basic construct in the 
theory we have been developing. It refers to 
a discrepancy between the actual experience 
of the organism and the self picture of the 
individual insofar as it represents that experi- 
ence. Thus a student may experience, at a 
total or organismic level, a fear of the uni- 
versity and of examinations which are given 
on the third floor of a certain building, since
these may demonstrate a fundamental inadequacy
in him. Since such a fear of his inadequacy
is decidedly at odds with his concept
of himself, this experience is represented (dis- 
tortedly) in his awareness as an unreasonable 
fear of climbing stairs in this building, or any 
building, and soon an unreasonable fear of 
crossing the open campus. Thus there is a 
fundamental discrepancy between the experi- 
enced meaning of the situation as it registers 
in his organism and the symbolic representa- 
tion of that experience in awareness in such 
a way that it does not conflict with the pic- 
ture he has of himself. In this case to admit 
a fear of inadequacy would contradict the 
picture he holds of himself; to admit incom- 
prehensible fears does not contradict his self 
concept. 

Another instance would be the mother who 
develops vague illnesses whenever her only 
son makes plans to leave home. The actual 
desire is to hold on to her only source of 
satisfaction. To perceive this in awareness 
would be inconsistent with the picture she 
holds of herself as a good mother. Illness, 
however, is consistent with her self concept, 
and the experience is symbolized in this dis- 
torted fashion. Thus again there is a basic 
incongruence between the self as perceived 
(in this case as an ill mother needing atten- 
tion) and the actual experience (in this case 
the desire to hold on to her son). 

When the individual has no awareness of 
such incongruence in himself, then he is 
merely vulnerable to the possibility of anxiety 
and disorganization. Some experience might 
occur so suddenly or so obviously that the in- 
congruence could not be denied. Therefore, 
the person is vulnerable to such a possibility. 

If the individual dimly perceives such an 
incongruence in himself, then a tension state 
occurs which is known as anxiety. The in- 
congruence need not be sharply perceived. It 
is enough that it is subceived—that is, dis- 
criminated as threatening to the self without 
any awareness of the content of that threat. 
Such anxiety is often seen in therapy as the 
individual approaches awareness of some ele- 
ment of his experience which is in sharp con- 
tradiction to his self concept. 

It is not easy to give precise operational 
definition to this second of the six conditions,
yet to some degree this has been achieved.
Several research workers have defined the self 
concept by means of a Q sort by the indi- 
vidual of a list of self-referent items. This 
gives us an operational picture of the self. 
The total experiencing of the individual is 
more difficult to capture. Chodorkoff (2) has 
defined it as a Q sort made by a clinician who 
sorts the same self-referent items independ- 
ently, basing his sorting on the picture he has 
obtained of the individual from projective 
tests. His sort thus includes unconscious as 
well as conscious elements of the individual’s 
experience, thus representing (in an admit- 
tedly imperfect way) the totality of the cli- 
ent’s experience. The correlation between these 
two sortings gives a crude operational measure 
of incongruence between self and experience, 
low or negative correlation representing of 
course a high degree of incongruence. 
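
The correlation between the two sortings can be computed in a few lines. The following is a minimal sketch only, assuming each Q sort is recorded as a list of pile numbers in the same item order; the function name, the five-pile scale, and the data are hypothetical and are not taken from Chodorkoff's study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("both sorts must cover the same items (at least two)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a sort with no variation has no defined correlation")
    return cov / sqrt(var_x * var_y)


# Hypothetical pile placements for eight self-referent items
# (1 = least characteristic of the person, 5 = most characteristic).
self_sort = [5, 4, 4, 3, 3, 2, 2, 1]        # sorted by the individual
clinician_sort = [1, 2, 2, 3, 3, 4, 4, 5]   # sorted by the clinician

# A low or negative r is read as a high degree of incongruence.
r = pearson_r(self_sort, clinician_sort)
```

In this invented case the clinician's sort is the exact reverse of the individual's, so the correlation falls at the negative extreme, which would be read as maximal incongruence between self and experience.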


The Therapist’s Genuineness in the Relation- 
ship 


The third condition is that the therapist 
should be, within the confines of this rela- 
tionship, a congruent, genuine, integrated per- 
son. It means that within the relationship he 
is freely and deeply himself, with his actual 
experience accurately represented by his 
awareness of himself. It is the opposite of 
presenting a facade, either knowingly or un- 
knowingly. 

It is not necessary (nor is it possible) that 
the therapist be a paragon who exhibits this 
degree of integration, of wholeness, in every 
aspect of his life. It is sufficient that he is ac- 
curately himself in this hour of this relation- 
ship, that in this basic sense he is what he 
actually is, in this moment of time. 

It should be clear that this includes being 
himself even in ways which are not regarded 
as ideal for psychotherapy. His experience 
may be “I am afraid of this client” or “My 
attention is so focused on my own problems 
that I can scarcely listen to him.” If the 
therapist is not denying these feelings to 
awareness, but is able freely to be them (as 
well as being his other feelings), then the 
condition we have stated is met. 

It would take us too far afield to consider 
the puzzling matter as to the degree to which 








98 Carl R. 
the therapist overtly communicates this re- 
ality in himself to the client. Certainly the 
aim is not for the therapist to express or talk 
out his own feelings, but primarily that he 
should not be deceiving the client as to him- 
self. At times he may need to talk out some 
of his own feelings (either to the client, or to 
a colleague or supervisor) if they are stand- 
ing in the way of the two following conditions. 

It is not too difficult to suggest an opera- 
tional definition for this third condition. We 
resort again to Q technique. If the therapist 
sorts a series of items relevant to the relation- 
ship (using a list similar to the ones devel- 
oped by Fiedler [3, 4] and Bown [1]), this 
will give his perception of his experience in 
the relationship. If several judges who have 
observed the interview or listened to a re- 
cording of it (or observed a sound movie of 
it) now sort the same items to represent their 
perception of the relationship, this second 
sorting should catch those elements of the 
therapist’s behavior and inferred attitudes of 
which he is unaware, as well as those of 
which he is aware. Thus a high correlation 
between the therapist’s sort and the observ- 
er’s sort would represent in crude form an 
operational definition of the therapist’s con- 
gruence or integration in the relationship; 
and a low correlation, the opposite. 


Unconditional Positive Regard 


To the extent that the therapist finds him- 
self experiencing a warm acceptance of each 
aspect of the client’s experience as being a 
part of that client, he is experiencing uncon- 
ditional positive regard. This concept has 
been developed by Standal (8). It means that 
there are no conditions of acceptance, no feel- 
ing of “I like you only if you are thus and 
so.” It means a “prizing” of the person, as 
Dewey has used that term. It is at the op- 
posite pole from a selective evaluating atti- 
tude—“You are bad in these ways, good in 
those.” It involves as much feeling of ac- 
ceptance for the client’s expression of nega- 
tive, “bad,” painful, fearful, defensive, abnor- 
mal feelings as for his expression of “good,” 
positive, mature, confident, social feelings, as 
much acceptance of ways in which he is inconsistent
as of ways in which he is consistent.
It means a caring for the client, but not
in a possessive way or in such a way as sim- 
ply to satisfy the therapist’s own needs. It 
means a caring for the client as a separate 
person, with permission to have his own feel- 
ings, his own experiences. One client describes 
the therapist as “fostering my possession of 
my own experience ... that [this] is my 
experience and that I am actually having it: 
thinking what I think, feeling what I feel, 
wanting what I want, fearing what I fear: no 
‘ifs,’ ‘buts,’ or ‘not reallys.’” This is the type 
of acceptance which is hypothesized as being 
necessary if personality change is to occur. 

Like the two previous conditions, this 
fourth condition is a matter of degree,3 as
immediately becomes apparent if we attempt 
to define it in terms of specific research op- 
erations. One such method of giving it defi- 
nition would be to consider the Q sort for the 
relationship as described under Condition 3. 
To the extent that items expressive of uncon- 
ditional positive regard are sorted as charac- 
teristic of the relationship by both the thera- 
pist and the observers, unconditional positive 
regard might be said to exist. Such items 
might include statements of this order: “I 
feel no revulsion at anything the client says”;
“I feel neither approval nor disapproval of
the client and his statements—simply accept- 
ance”; “I feel warmly toward the client—to- 
ward his weaknesses and problems as well as 
his potentialities”; “I am not inclined to pass 
judgment on what the client tells me”; “I 
like the client.” To the extent that both 
therapist and observers perceive these items 
as characteristic, or their opposites as un- 
characteristic, Condition 4 might be said to 
be met. 
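
One way to turn this into a number is to count the regard items that every rater places among the characteristic piles. The sketch below is an illustration under stated assumptions, not a procedure given in the text: the five-pile scale, the cut-off of pile 4, the item labels, and the function name are all hypothetical.

```python
def regard_index(therapist_sort, observer_sorts, regard_items, threshold=4):
    """Fraction of regard items that every rater sorted as characteristic.

    Each sort maps an item label to a pile number; an item counts as
    "characteristic" when its pile is at or above `threshold` (an assumed
    cut-off, chosen here only for illustration).
    """
    if not regard_items:
        raise ValueError("at least one regard item is required")
    sorts = [therapist_sort, *observer_sorts]
    agreed = [
        item for item in regard_items
        if all(sort[item] >= threshold for sort in sorts)
    ]
    return len(agreed) / len(regard_items)


# Hypothetical placements on a 1-5 scale (5 = most characteristic).
therapist = {"likes_client": 5, "no_revulsion": 4, "no_judgment": 2, "other": 3}
observer_a = {"likes_client": 4, "no_revulsion": 5, "no_judgment": 4, "other": 1}
observer_b = {"likes_client": 5, "no_revulsion": 4, "no_judgment": 3, "other": 2}

items = ["likes_client", "no_revulsion", "no_judgment"]
index = regard_index(therapist, [observer_a, observer_b], items)
```

Here two of the three regard items are placed as characteristic by the therapist and both observers, so the index is two thirds. Requiring agreement from every rater is one reading of "both the therapist and the observers"; averaging the observers first would be an equally defensible choice.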


3 The phrase “unconditional positive regard” may
be an unfortunate one, since it sounds like an ab- 
solute, an all or nothing dispositional concept. It is 
probably evident from the description that com- 
pletely unconditional positive regard would never ex- 
ist except in theory. From a clinical and experiential 
point of view I believe the most accurate statement 
is that the effective therapist experiences uncondi- 
tional positive regard for the client during many mo- 
ments of his contact with him, yet from time to time 
he experiences only a conditional positive regard— 
and perhaps at times a negative regard, though this 
is not likely in effective therapy. It is in this sense 
that unconditional positive regard exists as a matter 
of degree in any relationship. 







Empathy 


The fifth condition is that the therapist is 
experiencing an accurate, empathic under- 
standing of the client’s awareness of his own 
experience. To sense the client’s private world 
as if it were your own, but without ever los- 
ing the “as if” quality—this is empathy, and 
this seems essential to therapy. To sense the 
client’s anger, fear, or confusion as if it were 
your own, yet without your own anger, fear, 
or confusion getting bound up in it, is the 
condition we are endeavoring to describe. 
When the client’s world is this clear to the 
therapist, and he moves about in it freely, 
then he can both communicate his under- 
standing of what is clearly known to the cli- 
ent and can also voice meanings in the cli- 
ent’s experience of which the client is scarcely 
aware. As one client described this second as- 
pect: “Every now and again, with me in a 
tangle of thought and feeling, screwed up in 
a web of mutually divergent lines of move- 
ment, with impulses from different parts of 
me, and me feeling the feeling of its being all 
too much and suchlike—then whomp, just 
like a sunbeam thrusting its way through 
cloudbanks and tangles of foliage to spread 
a circle of light on a tangle of forest paths, 
came some comment from you. [It was] 
clarity, even disentanglement, an additional 
twist to the picture, a putting in place. Then 
the consequence—the sense of moving on, the 
relaxation. These were sunbeams.” That such 
penetrating empathy is important for therapy 
is indicated by Fiedler’s research (3) in which 
items such as the following placed high in the 
description of relationships created by experi- 
enced therapists: 


The therapist is well able to understand the pa- 
tient’s feelings. 

The therapist is never in any doubt about what 
the patient means. 

The therapist’s remarks fit in just right with the 
patient’s mood and content. 

The therapist’s tone of voice conveys the com- 
plete ability to share the patient’s feelings. 


An operational definition of the therapist’s 
empathy could be provided in different ways. 
Use might be made of the Q sort described 
under Condition 3. To the degree that items 
descriptive of accurate empathy were sorted 
as characteristic by both the therapist and the
observers, this condition would be regarded
as existing. 

Another way of defining this condition 
would be for both client and therapist to sort 
a list of items descriptive of client feelings. 
Each would sort independently, the task be- 
ing to represent the feelings which the client 
had experienced during a just completed in- 
terview. If the correlation between client and 
therapist sortings were high, accurate empathy 
would be said to exist, a low correlation indi- 
cating the opposite conclusion. 

Still another way of measuring empathy 
would be for trained judges to rate the depth 
and accuracy of the therapist’s empathy on 
the basis of listening to recorded interviews. 


The Client’s Perception of the Therapist 

The final condition as stated is that the cli- 
ent perceives, to a minimal degree, the ac- 
ceptance and empathy which the therapist 
experiences for him. Unless some communica- 
tion of these attitudes has been achieved, 
then such attitudes do not exist in the rela- 
tionship as far as the client is concerned, and 
the therapeutic process could not, by our hy- 
pothesis, be initiated. 

Since attitudes cannot be directly perceived, 
it might be somewhat more accurate to state 
that therapist behaviors and words are per-
ceived by the client as meaning that to some 
degree the therapist accepts and understands 
him. 

An operational definition of this condition 
would not be difficult. The client might, after 
an interview, sort a Q-sort list of items re- 
ferring to qualities representing the relation- 
ship between himself and the therapist. (The 
same list could be used as for Condition 3.) 
If several items descriptive of acceptance and 
empathy are sorted by the client as charac- 
teristic of the relationship, then this condi- 
tion could be regarded as met. In the present 
state of our knowledge the meaning of “to a 
minimal degree” would have to be arbitrary. 


Some Comments 


Up to this point the effort has been made 
to present, briefly and factually, the condi- 
tions which I have come to regard as essen- 
tial for psychotherapeutic change. I have not 
tried to give the theoretical context of these 








100 


conditions nor to explain what seem to me to 
be the dynamics of their effectiveness. Such 
explanatory material will be available, to the 
reader who is interested, in the document al- 
ready mentioned (see footnote 1). 

I have, however, given at least one means 
of defining, in operational terms, each of the 
conditions mentioned. I have done this in or- 
der to stress the fact that I am not speaking 
of vague qualities which ideally should be 
present if some other vague result is to occur. 
I am presenting conditions which are crudely 
measurable even in the present state of our 
technology, and have suggested specific op- 
erations in each instance even though I am 
sure that more adequate methods of measure- 
ment could be devised by a serious investi- 
gator. 

My purpose has been to stress the notion 
that in my opinion we are dealing with an 
if-then phenomenon in which knowledge of 
the dynamics is not essential to testing the 
hypotheses. Thus, to illustrate from another 
field: if one substance, shown by a series of 
operations to be the substance known as hy- 
drochloric acid, is mixed with another sub- 
stance, shown by another series of operations 
to be sodium hydroxide, then salt and water 
will be products of this mixture. This is true 
whether one regards the results as due to 
magic, or whether one explains it in the most 
adequate terms of modern chemical theory. 
In the same way it is being postulated here 
that certain definable conditions precede cer- 
tain definable changes and that this fact ex- 
ists independently of our efforts to account 
for it. 


The Resulting Hypotheses 


The major value of stating any theory in 
unequivocal terms is that specific hypotheses 
may be drawn from it which are capable of 
proof or disproof. Thus, even if the condi- 
tions which have been postulated as necessary 
and sufficient conditions are more incorrect 
than correct (which I hope they are not), 
they could still advance science in this field 
by providing a base of operations from which 
fact could be winnowed out from error. 

The hypotheses which would follow from 
the theory given would be of this order: 







If these six conditions (as operationally de- 
fined) exist, then constructive personality 
change (as defined) will occur in the client. 

If one or more of these conditions is not 
present, constructive personality change will 
not occur. 


These hypotheses hold in any situation 
whether it is or is not labeled “psychother- 
apy.” 
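
Taken together, the two hypotheses amount to a claim of logical equivalence. As a sketch in symbols that are introduced here only for illustration, let $C_i$ stand for "Condition $i$ exists and continues over a period of time" and let $P$ stand for "constructive personality change occurs in the client":

```latex
\begin{align*}
\text{Sufficiency:} \quad & (C_1 \land C_2 \land C_3 \land C_4 \land C_5 \land C_6) \Rightarrow P \\
\text{Necessity:}   \quad & \lnot (C_1 \land C_2 \land C_3 \land C_4 \land C_5 \land C_6) \Rightarrow \lnot P \\
\text{Together:}    \quad & P \iff \bigwedge_{i=1}^{6} C_i
\end{align*}
```

The necessity statement is the contrapositive of $P \Rightarrow \bigwedge_i C_i$, so a single well-documented case of change in the absence of any one condition would count against it.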

Only Condition 1 is dichotomous (it either 
is present or is not), and the remaining five 
occur in varying degree, each on its con- 
tinuum. Since this is true, another hypothesis 
follows, and it is likely that this would be the 
simplest to test: 


If all six conditions are present, then the 
greater the degree to which Conditions 2 to 6 
exist, the more marked will be the construc- 
tive personality change in the client. 


At the present time the above hypothesis can 
only be stated in this general form—which 
implies that all of the conditions have equal 
weight. Empirical studies will no doubt make 
possible much more refinement of this hy- 
pothesis. It may be, for example, that if anx- 
iety is high in the client, then the other con- 
ditions are less important. Or if unconditional 
positive regard is high (as in a mother’s love 
for her child), then perhaps a modest degree 
of empathy is sufficient. But at the moment 
we can only speculate on such possibilities. 


Some Implications 
Significant Omissions 


If there is any startling feature in the for- 
mulation which has been given as to the nec- 
essary conditions for therapy, it probably lies 
in the elements which are omitted. In pres- 
ent-day clinical practice, therapists operate as 
though there were many other conditions in 
addition to those described, which are essen- 
tial for psychotherapy. To point this up it 
may be well to mention a few of the condi- 
tions which, after thoughtful consideration of 
our research and our experience, are not in- 
cluded. 

For example, it is not stated that these con-
ditions apply to one type of client, and that
other conditions are necessary to bring about 
















psychotherapeutic change with other types of 
client. Probably no idea is so prevalent in 
clinical work today as that one works with 
neurotics in one way, with psychotics in an- 
other; that certain therapeutic conditions 
must be provided for compulsives, others for 
homosexuals, etc. Because of this heavy 
weight of clinical opinion to the contrary, it 
is with some “fear and trembling” that I ad- 
vance the concept that the essential condi- 
tions of psychotherapy exist in a single con- 
figuration, even though the client or patient 
may use them very differently.⁴

It is not stated that these six conditions
are the essential conditions for client-centered 
therapy, and that other conditions are essen- 
tial for other types of psychotherapy. I cer- 
tainly am heavily influenced by my own ex- 
perience, and that experience has led me to a 
viewpoint which is termed “client centered.” 
Nevertheless my aim in stating this theory is 
to state the conditions which apply to any 
situation in which constructive personality 
change occurs, whether we are thinking of 
classical psychoanalysis, or any of its modern 
offshoots, or Adlerian psychotherapy, or any 
other. It will be obvious then that in my 
judgment much of what is considered to be 
essential would not be found, empirically, to 
be essential. Testing of some of the stated 
hypotheses would throw light on this per- 
plexing issue. We may of course find that 
various therapies produce various types of 
personality change, and that for each psycho- 
therapy a separate set of conditions is neces- 
sary. Until and unless this is demonstrated, I 

4 I cling to this statement of my hypothesis even
though it is challenged by a just completed study by 
Kirtner (5). Kirtner has found, in a group of 26 
cases from the Counseling Center at the University 
of Chicago, that there are sharp differences in the 
client’s mode of approach to the resolution of life 
difficulties, and that these differences are related to 
success in psychotherapy. Briefly, the client who 
sees his problem as involving his relationships, and 
who feels that he contributes to this problem and 
wants to change it, is likely to be successful. The 
client who externalizes his problem, feeling little self- 
responsibility, is much more likely to be a failure. 
Thus the implication is that some other conditions 
need to be provided for psychotherapy with this 
group. For the present, however, I will stand by my 
hypothesis as given, until Kirtner’s study is con- 
firmed, and until we know an alternative hypothe- 
sis to take its place. 







am hypothesizing that effective psychotherapy
of any sort produces similar changes in per- 
sonality and behavior, and that a single set 
of preconditions is necessary. 

It is not stated that psychotherapy is a
special kind of relationship, different in kind 
from all others which occur in everyday life. 
It will be evident instead that for brief mo- 
ments, at least, many good friendships fulfill 
the six conditions. Usually this is only mo- 
mentarily, however, and then empathy falters, 
the positive regard becomes conditional, or 
the congruence of the “therapist” friend be-
comes overlaid by some degree of facade or
defensiveness. Thus the therapeutic relation-
ship is seen as a heightening of the construc-
tive qualities which often exist in part in
other relationships, and an extension through 
time of qualities which in other relationships 
tend at best to be momentary. 

It is not stated that special intellectual
professional knowledge—psychological, psy-
chiatric, medical, or religious—is required of
the therapist. Conditions 3, 4, and 5, which
apply especially to the therapist, are quali-
ties of experience, not intellectual informa-
tion. If they are to be acquired, they must,
in my opinion, be acquired through an ex-
periential training—which may be, but usu-
ally is not, a part of professional training. It
troubles me to hold such a radical point of 
view, but I can draw no other conclusion from 
my experience. Intellectual training and the 
acquiring of information has, I believe, many 
valuable results—but becoming a therapist is 
not one of those results. 

It is not stated that it is necessary for psy-
chotherapy that the therapist have an accu-
rate psychological diagnosis of the client. 
Here too it troubles me to hold a viewpoint 
so at variance with my clinical colleagues.
When one thinks of the vast proportion of 
time spent in any psychological, psychiatric, 
or mental hygiene center on the exhaustive 
psychological evaluation of the client or pa- 
tient, it seems as though this must serve a 
useful purpose insofar as psychotherapy is 
concerned. Yet the more I have observed 
therapists, and the more closely I have studied 
research such as that done by Fiedler and 
others (4), the more I am forced to the con- 
clusion that such diagnostic knowledge is not 


essential to psychotherapy.⁵ It may even be
that its defense as a necessary prelude to psy- 
chotherapy is simply a protective alternative 
to the admission that it is, for the most part, 
a colossal waste of time. There is only one 
useful purpose I have been able to observe 
which relates to psychotherapy. Some thera- 
pists cannot feel secure in the relationship 
with the client unless they possess such diag- 
nostic knowledge. Without it they feel fearful 
of him, unable to be empathic, unable to ex- 
perience unconditional regard, finding it nec- 
essary to put up a pretense in the relation- 
ship. If they know in advance of suicidal 
impulses they can somehow be more accept- 
ant of them. Thus, for some therapists, the 
security they perceive in diagnostic infor- 
mation may be a basis for permitting them- 
selves to be integrated in the relationship, 
and to experience empathy and full accept- 
ance. In these instances a psychological diag- 
nosis would certainly be justified as adding to 
the comfort and hence the effectiveness of the 
therapist. But even here it does not appear to 
be a basic precondition for psychotherapy.⁶

Perhaps I have given enough illustrations 
to indicate that the conditions I have hy- 
pothesized as necessary and sufficient for psy- 
chotherapy are striking and unusual pri- 
marily by virtue of what they omit. If we 
were to determine, by a survey of the be- 
haviors of therapists, those hypotheses which 
they appear to regard as necessary to psy- 
chotherapy, the list would be a great deal 
longer and more complex. 


Is This Theoretical Formulation Useful? 


Aside from the personal satisfaction it gives 
as a venture in abstraction and generalization, 
what is the value of a theoretical statement 


5 There is no intent here to maintain that diag- 
nostic evaluation is useless. We have ourselves made 
heavy use of such methods in our research studies of 
change in personality. It is its usefulness as a pre- 
condition to psychotherapy which is questioned. 

6 In a facetious moment I have suggested that such 
therapists might be made equally comfortable by be- 
ing given the diagnosis of some other individual, not 
of this patient or client. The fact that the diagnosis 
proved inaccurate as psychotherapy continued would 
not be particularly disturbing, because one always 
expects to find inaccuracies in the diagnosis as one 
works with the individual. 







such as has been offered in this paper? I 
should like to spell out more fully the useful- 
ness which I believe it may have. 

In the field of research it may give both 
direction and impetus to investigation. Since 
it sees the conditions of constructive person- 
ality change as general, it greatly broadens 
the opportunities for study. Psychotherapy is 
not the only situation aimed at constructive 
personality change. Programs of training for 
leadership in industry and programs of train- 
ing for military leadership often aim at such 
change. Educational institutions or programs 
frequently aim at development of character 
and personality as well as at intellectual skills. 
Community agencies aim at personality and 
behavioral change in delinquents and crimi- 
nals. Such programs would provide an oppor- 
tunity for the broad testing of the hypotheses 
offered. If it is found that constructive per- 
sonality change occurs in such programs when 
the hypothesized conditions are not fulfilled, 
then the theory would have to be revised. If 
however the hypotheses are upheld, then the 
results, both for the planning of such pro- 
grams and for our knowledge of human dy- 
namics, would be significant. In the field of 
psychotherapy itself, the application of con- 
sistent hypotheses to the work of various 
schools of therapists may prove highly profit- 
able. Again the disproof of the hypotheses of- 
fered would be as important as their confir- 
mation, either result adding significantly to 
our knowledge. 

For the practice of psychotherapy the the- 
ory also offers significant problems for con- 
sideration. One of its implications is that the 
techniques of the various therapies are rela- 
tively unimportant except to the extent that 
they serve as channels for fulfilling one of the 
conditions. In client-centered therapy, for ex-
ample, the technique of “reflecting feelings”
has been described and commented on (6, pp. 
26-36). In terms of the theory here being pre- 
sented, this technique is by no means an es- 
sential condition of therapy. To the extent, 
however, that it provides a channel by which 
the therapist communicates a sensitive em- 
pathy and an unconditional positive regard, 
then it may serve as a technical channel by 
which the essential conditions of therapy are 
fulfilled. In the same way, the theory I have 




















presented would see no essential value to 
therapy of such techniques as interpretation 
of personality dynamics, free association, 
analysis of dreams, analysis of the transfer- 
ence, hypnosis, interpretation of life style, 
suggestion, and the like. Each of these tech- 
niques may, however, become a channel for 
communicating the essential conditions which 
have been formulated. An interpretation may 
be given in a way which communicates the 
unconditional positive regard of the therapist. 
A stream of free association may be listened 
to in a way which communicates an empathy 
which the therapist is experiencing. In the 
handling of the transference an effective 
therapist often communicates his own whole- 
ness and congruence in the relationship. Simi- 
larly for the other techniques. But just as 
these techniques may communicate the ele- 
ments which are essential for therapy, so any 
one of them may communicate attitudes and 
experiences sharply contradictory to the hy- 
pothesized conditions of therapy. Feeling may 
be “reflected” in a way which communicates 
the therapist’s lack of empathy. Interpreta- 
tions may be rendered in a way which indi- 
cates the highly conditional regard of the 
therapist. Any of the techniques may com- 
municate the fact that the therapist is ex- 
pressing one attitude at a surface level, and 
another contradictory attitude which is de- 
nied to his own awareness. Thus one value of 
such a theoretical formulation as we have of- 
fered is that it may assist therapists to think 
more critically about those elements of their 
experience, attitudes, and behaviors which 
are essential to psychotherapy, and those 
which are nonessential or even deleterious to 
psychotherapy. 

Finally, in those programs—educational, 
correctional, military, or industrial—which 
aim toward constructive changes in the per- 
sonality structure and behavior of the indi- 
vidual, this formulation may serve as a very 
tentative criterion against which to measure 
the program. Until it is much further tested 
by research, it cannot be thought of as a 
valid criterion, but, as in the field of psycho- 


therapy, it may help to stimulate critical 
analysis and the formulation of alternative 
conditions and alternative hypotheses. 


Summary 


Drawing from a larger theoretical context, 
six conditions are postulated as necessary and 
sufficient conditions for the initiation of a 
process of constructive personality change. A 
brief explanation is given of each condition, 
and suggestions are made as to how each may 
be operationally defined for research purposes. 
The implications of this theory for research, 
for psychotherapy, and for educational and 
training programs aimed at constructive per- 
sonality change, are indicated. It is pointed 
out that many of the conditions which are 
commonly regarded as necessary to psycho- 
therapy are, in terms of this theory, non- 
essential. 


Received June 6, 1956. 


References 


1. Bown, O. H. An investigation of therapeutic relationship in
client-centered therapy. Unpublished doctor’s dissertation,
Univer. of Chicago, 1954.

2. Chodorkoff, B. Self-perception, perceptual defense, and
adjustment. J. abnorm. soc. Psychol., 1954, 49, 508-512.

3. Fiedler, F. E. A comparison of therapeutic relationships in
psychoanalytic, non-directive and Adlerian therapy. J. consult.
Psychol., 1950, 14, 436-445.

4. Fiedler, F. E. Quantitative studies on the role of therapists’
feelings toward their patients. In O. H. Mowrer (Ed.),
Psychotherapy: theory and research. New York: Ronald, 1953.

5. Kirtner, W. L. Success and failure in client-centered therapy
as a function of personality variables. Unpublished master’s
thesis, Univer. of Chicago, 1955.

6. Rogers, C. R. Client-centered therapy. Boston: Houghton
Mifflin, 1951.

7. Rogers, C. R., & Dymond, Rosalind F. (Eds.) Psychotherapy and
personality change. Chicago: Univer. of Chicago Press, 1954.

8. Standal, S. The need for positive regard: a contribution to
client-centered theory. Unpublished doctor’s dissertation,
Univer. of Chicago, 1954.










Journal of Consulting Psychology 
Vol. 21, No. 2, 1957 


Predictive Empathy and The Study of Values


Howard M. Halpern 
Bronx VA Hospital 


In the most commonly employed measure 
of empathy, subjects are required to predict 
the rating behavior of others. The use of pre- 
dictions makes sense since it provides an op- 
erational measure of empathy that makes it 
a manageable concept. Perhaps the greatest 
procedural objection to the predictive method 
is that it requires a very special and cumber- 
some set of conditions—namely, the existence 
and cooperation of a group of acquaintances 
of the subject. There is now no adequate way 
to measure a person’s empathy individually. 

To construct an individual empathy test, it 
will be necessary to first know what kind of 
personality attributes correlate well with pre- 
dictive empathy. The purpose of this brief 
article is to put into the literature a correla- 
tional study that may be used toward that 
end. 

A sample of 37 female nurses was divided 
into four groups as previously reported (3). 
Their predictive accuracy in rating five fel- 
low group members on an 80-item inventory 
was determined. This was correlated with the 
Allport-Vernon-Lindzey Study of Values (1). 
The following correlations were found to hold: 
Social, .355; religious, .203; economic, .108;
political, .060; theoretical, -.086; esthetic,
-.338.

The only correlations significant at the .05 
level were the positive correlation of predic- 
tions with Social Values and the negative cor- 
relation of predictions with Esthetic Values. 
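
The significance statement above can be checked roughly by converting each reported correlation to a t statistic. The sketch below is only illustrative: it assumes that all 37 nurses entered each coefficient (35 degrees of freedom) and that a two-tailed test was used, neither of which the article states, and the critical value of 2.030 comes from a standard t table rather than from the source.

```python
import math

# Correlations of predictive accuracy with the six Study of Values scales,
# as reported in the text.
correlations = {
    "social": 0.355,
    "religious": 0.203,
    "economic": 0.108,
    "political": 0.060,
    "theoretical": -0.086,
    "esthetic": -0.338,
}

N = 37             # assumption: the full sample enters every correlation
T_CRIT_05 = 2.030  # assumption: two-tailed .05 critical t for df = 35 (standard table)


def t_for_r(r, n):
    """t statistic for testing a Pearson r against zero, with n - 2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Scales whose correlation exceeds the critical value in absolute size.
significant = sorted(
    name for name, r in correlations.items() if abs(t_for_r(r, N)) > T_CRIT_05
)
print(significant)
```

Under these assumptions only the social and esthetic scales pass the .05 criterion, which agrees with the result reported above.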

The social type is summarized in the manual 


of directions to the Study of Values as fol- 
lows: “The highest value for this type is love 
of people. . . . The social man prizes other 
persons as ends, and is therefore himself kind, 
sympathetic and unselfish. . . . In its purest 
form the social interest is selfless and tends to 
approach very closely to the religious atti- 
tude” (2, p. 14). 

Since there have been studies indicating 
that psychologists with artistic interests are 
better empathizers than those without artistic 
interests, the negative correlation of predic- 
tive accuracy with esthetic values needs ex- 
planation. The manual is again instructive. 
“The Esthetic” is described as a man who 
“sees his highest value in form and harmony. 
Each single experience is judged from the 
standpoint of grace, symmetry, or fitness. 
. . . He need not be a creative artist. . . . In
social affairs he may be said to be interested
in persons but not in the welfare of persons”
(2).


Received July 27, 1956. 


References 


1. Allport, G. W., Vernon, P. E., & Lindzey, G. 
Study of values. Boston: Houghton Mifflin, 
1951. 

2. Allport, G. W., Vernon, P. E., & Lindzey, G. 
Study of values: manual of directions. Bos- 
ton: Houghton Mifflin, 1951. 

3. Halpern, H. M. Empathy, similarity and self- 
satisfaction. J. consult. Psychol., 1955, 19, 
449-452. 













Journal of Consulting Psychology
Vol. 21, No. 2, 1957


The Validity of Judgments Based on “Blind” 
Rorschach Records¹


Guinevere S. Chambers and Roy M. Hamlin 


Western Psychiatric Institute, University of Pittsburgh 


The sparsity of objective, controlled studies 
bearing on the question of whether the Ror- 
schach can validly identify clinical groups is 
brought into sharp focus by Ainsworth in the 
recent book on the Rorschach by Klopfer 
et al. (8). In an extensive review of validity 
research, she cites three studies as being the 
most significant on this point (1, 5, 9). The 
whole question of correlation between Ror- 
schach evidence and clinical groups is dis- 
missed in a rather unconcerned fashion. She 
is more interested in validating “underlying 
principles” and the inference is that no one 
with a true appreciation for projective pro- 
cedures would bother about such a question 
anyway. To quote her specifically: “all these 
‘blind’ interpretations and diagnoses seem to 
be more of a tour de force to impress the 
skeptic than to represent a serious attempt 
to test out the basic hypotheses upon which 
both interpretation and diagnosis are based” 
(8, p. 463). 

In view of Ainsworth’s admission that other 
approaches to validation have not added 
much to the security of the clinician’s posi- 
tion, it appears a bit early to dismiss the pro- 
cedure of correlating the Rorschach with the 
outside criterion of clinical groups on which 
there is at least some degree of agreement. 
While “underlying hypotheses” are admittedly 
important, they depend for their significance 
on the over-all validity of the technique as it 
is put to use. 

As the Rorschach is used in actual clinical 


1 This article is based on a dissertation submitted 
in partial fulfillment of the requirements for the de- 
gree of Doctor of Philosophy, University of Pitts- 
burgh. Appreciation is expressed to Dr. A. W. Bendig 
for his advice on statistical techniques. 


practice, the clinician and the tool are an 
entity. Attempts to validate the Rorschach 
“with the interpreter attached” (7) have met 
with widely varying degrees of success. Ham- 
lin (4) in comparing 10 such studies con- 
cludes that the disparity in results is a func- 
tion of differences in methodology mainly re- 
lated to the size of units employed and the 
over-all complexity of the judgment task as- 
signed to the individual psychologist. 

The present experiment is designed to meet 
Hamlin’s conditions of presenting the cli- 
nician with an adequate sample of material 
pertinent to the judgment required (total 
Rorschach) and of keeping the judgment task 
from being too complex (a single judgment 
on each of five Rorschachs). Every effort was 
made to achieve a stable criterion against 
which to check the Rorschach and to clearly 
define and delimit the task presented to the 
psychologist judges. This study simply asks: 
(a) Can clinicians validly identify patient 
groups on the basis of “blind” Rorschachs? 
(b) Is there a difference in the Rorschach
elements used as a basis for interpretation by 
clinicians with varying degrees of success on 
this task? 


Method 


Twenty psychologists were each asked to 
identify five Rorschachs according to clinical 
group. The clinical groups were limited to 
five, and the Rorschach judges were informed 
as to which groups were represented. The 
judges were also told that they would receive 
one record from each group. The five clinical 
groups were: (a) involutional depression; 
(b) anxiety neurosis; (c) paranoid schizo-
phrenia; (d) brain damage from neurosyphi-
lis; (e) adult mental deficiency. Except for
serving as an economical means of communi- 
cation in designating five distinct types of 
disorders, diagnosis per se was of minimal 
importance in this study. In selecting cases, 
a major objective was to employ selection cri- 
teria which would maximize similarity of be- 
havior within groups and minimize similarity 
between groups. 

All of the individuals whose Rorschach 
protocols constitute the raw data of this in- 
vestigation had been patients at Western Psy- 
chiatric Institute and Clinic of the Univer-
sity of Pittsburgh.²

A careful examination was made of case 
history material, medical findings and prog- 
ress notes of several hundred cases on file, 
bearing one of the five psychiatric diagnoses 
here considered. Patients were selected who 
met specific behavioral criteria for each group. 
For example, the case record of each of the 
neurotic patients was carefully studied to 
guarantee that the following criteria were 
met: no indications of schizophrenic person- 
ality features such as bizarre thinking or loss 
of affect; marked discomfort and incapacita- 
tion from anxiety feelings as the prominent 
features of the illness; psychiatric treatment 
administered on an outpatient basis; super- 
ficial adaptiveness shown, i.e., did not distort 
reality and attempted to adjust to social de- 
mands; a history of use of ineffectual de- 
fenses against anxiety which resulted in so- 
matic complaints; and nontest evidence of 
marked anxiety at time of testing. Thus, the 
Rorschach in no way biased selection of cases. 
All cases originally selected on this basis were 
used in the study; none was rejected after 
the study began. 

All protocols were scored and identified
only by randomly selected code numbers and 
sex. The age of the patient was not recorded 
on the protocol as it might serve as a clue in 
the identification of particular groups. 

Twenty sets of five Rorschachs each were 
prepared for distribution to twenty judges. 
Each judge was to judge five records; each 
record would be judged four times yielding a 
total of 100 judgments. Selection of each rec- 


2 The five mental defectives were imbeciles, tempo- 
rarily transferred for a research project from Polk 
State School, Polk, Pennsylvania. 


ord for a given set of five was made on a 
chance basis. Letters were sent to 35 psy- 
chologists from all parts of the country who 
were known to have had at least three years 
of clinical experience which included the use 
of the Rorschach technique. Final selection 
of the 20 judges was determined by the order 
in which replies were received.³ The judge’s
task was to identify each of his five Ror- 
schachs according to which one of the five 
possible groups he felt it belonged. In addi- 
tion, each judge was asked to make four state- 
ments summarizing elements of major impor- 
tance influencing his thinking in arriving at 
his decision on each record. Judges were en- 
couraged to verbalize such “unscientific” rea- 
sons as “hunches” if they felt that the judg- 
ment had been reached by just such a method. 


Results 
Judgment Task 


The judgment task is a forced-choice situa- 
tion and judgments do not represent inde- 
pendent events, i.e., once a choice is made, 
the chances for success on the next choice are 
greater and so on throughout the series. 
Dudek (3) has constructed a frequency dis- 
tribution of scores that might be obtained by 
chance in such situations. Expected frequen- 
cies derived from this table were used and 
the chi-square technique applied to the data. 
The number of judges making two or more 
correct judgments was entered in one cell;
those making one correct judgment in the 
second; and those having no successes in the 
third. The chi-square value of 30.59 is highly 
significant. 

From the obtained results it may be con- 
cluded that trained Rorschach workers can 
identify “blind” Rorschachs according to 
known clinical groups significantly better than 
could occur by chance. Table 1 presents the 
correct and incorrect classifications for each 


3 The writers wish to thank the following who so 
graciously served as judges: Doctors Lawrence M. 
Baker, Marianne Beran, David Cohen, Gordon 
Filmer-Bennett, Bernice Gurvich, Frederick J. Heim- 
lich, Joseph S. Herrington, Bruno Klopfer, Kate L. 
Kogan, William S. Kogan, Janet M. Lyon, Karen 
Machover, Charles F. Mason, Gerald R. Pascal, 
Zygmunt Piotrowski, Alan K. Rosenwald, James C. 
Stauffacher, John W. Whitmyre, Miss Eleanor M. 
Rose, and Major Wendell R. Wilkin. 







Table 1
Correct and Incorrect Classifications Made by Judges

                                        Clinical group judged
                                 1          2         3        4         5
                                                   Schizo-   Brain
Actual clinical group        Depression  Neurosis  phrenia   damage  Deficiency
1. Involutional depression       10         4         1        4         1
2. Anxiety neurosis               3        11         4        2         0
3. Paranoid schizophrenia         4         5         8        3         0
4. Brain damage                   1         0         7       11         1
5. Mental deficiency              1         0         0        1        18


clinical group. Five judges of the 20 had com- 
plete success. While the chi-square value 
(30.59) is highly significant, the findings are 
possibly more meaningful when considered in 
terms of probability. In such a judging situa- 
tion, only one judge in 120 would be expected 
to make five correct choices by chance. The 
probabilities for 20 judges taken by groups 
are: two or more correct judgments, an ex- 
pectancy of 5.2 judges; one correct judgment, 
7.5 judges; and no successes, 7.3 judges. For 
our judges, 16 had two or more successes; 
two had one correct judgment; and two had 
no successes. Stated in terms of total judg- 
ments correct, of the 58, 25 were made by 
five judges; 27 by nine judges; and 6 by six 
judges. 
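
The chance expectancies quoted above can be reproduced by enumerating every way a judge could assign the five group labels to five records, one label per record. The following is a minimal sketch of that enumeration, not a reconstruction of Dudek's table; the variable names are illustrative, and the observed counts are those given in the text.

```python
from collections import Counter
from itertools import permutations

GROUPS = 5
JUDGES = 20

# For each of the 5! = 120 possible one-to-one labelings, count how many
# records receive their true label, then tally labelings by that count.
hits = Counter(
    sum(guess == truth for truth, guess in enumerate(perm))
    for perm in permutations(range(GROUPS))
)
total = sum(hits.values())

p_none = hits[0] / total
p_one = hits[1] / total
p_two_plus = 1 - p_none - p_one

# Expected number of judges in each cell under pure chance.
expected = {
    "two_or_more": JUDGES * p_two_plus,
    "one": JUDGES * p_one,
    "none": JUDGES * p_none,
}
# Observed number of judges in each cell, as reported in the text.
observed = {"two_or_more": 16, "one": 2, "none": 2}

chi_square = sum(
    (observed[k] - expected[k]) ** 2 / expected[k] for k in expected
)
print({k: round(v, 1) for k, v in expected.items()}, round(chi_square, 2))
```

Exactly one labeling in 120 is completely correct, and the expectancies round to 5.2, 7.5, and 7.3 judges, as stated. The chi-square computed this way comes to about 30.6, close to the reported 30.59; the small gap may reflect rounding in the tabled expectancies.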

An analysis of variance was made to deter- 
mine if the variance in judgments was a func- 
tion of (a) variation in validity among judges 
or (b) specific clinical groups being more or
less distinct than others. To effect this analy- 
sis, successes were assigned a score of one and 
failures were entered as zero. This analysis 
follows the technique of Hoyt (6) for esti- 
mating test reliability from consistency of in- 
dividual performances upon the items of a 
test. 

A significant difference between judges 
(.01) was to be expected since it was known 
that judges varied in success from five cor- 
rect judgments to no correct judgments. The 
aspect of this analysis with which we were con-
cerned deals with the question of whether
there were significant differences in difficulty 
among categories. The F ratio computed on 
this source of variation was significant at the 
.01 level. To test whether the extreme mean 




for the mental defective group contributed sig- 
nificantly more than the other group means to 
the obtained F ratio, the test for extreme 
mean recommended by Dixon (2) was ap- 
plied. This yielded a value significant at the 
.05 level, allowing the conclusion to be drawn 
that the mental defective group did differ sig- 
nificantly from the others in the direction of 
being more readily identifiable. 

A further analysis was made to determine 
whether specific records may have been espe- 
cially misleading. It was found that no record 
was misjudged all four times. The highest 
single occurrence of certain categories being 
consistently interchanged was for the organic 
and paranoid groups. The organic records 
were misjudged as paranoid seven times out 
of the total of 20 judgments. This is of con- 
siderable interest in that the presence of delu- 
sions was one of the selection criteria for each 
of these groups. 


Stated Reasons for Making Judgments 


The study of the reasons which the judges 
gave for their decisions was regarded as an 
exploratory procedure. From inspection of the 
reasons given by judges for making judgments 
many methods for classification are suggested. 
By considering the statements of the group 
of judges making five correct judgments (suc- 
cessful judges) in contrast to the group mak- 
ing one or no correct judgments (unsuccess- 
ful judges) certain elements of difference are 
immediately apparent. Most striking is the 
difference in the length of statements. Suc- 
cessful judges use fewer words to communi- 
cate their thinking than do the unsuccessful. 
In attempting to analyze what this difference 













represents in terms of the thinking employed, 
the emerging impression was that the success- 
ful judges tend to reach a higher level of 
abstraction from the raw data than do un- 
successful judges. On the basis of the ob- 
servation of the varying degrees of abstract- 
ness of the statements, two workers arrived 
at a three-point scale for classifying judges’ 
statements according to levels of abstract- 
ness. Representative statements were selected 
for each point of the scale. The scale pro- 
gressed from Level 1 to 3 in the direction of 
“distance” from the raw data. Statements in 
Level 1 were those where the evidence was 
presented in strictly Rorschach scoring terms. 
Level 2 was reserved for statements where 
evidence from many sources was cited, but no 
generalization was drawn. Level 3 included 
statements where (a) an over-all appraisal of 
the record was made; (b) a generalization
was drawn; and (c) the reasoning was ex- 
pressed in general clinical terms rather than 
in Rorschach terminology. 


Examples of rated statements from the three-
point scale are: Level 1 statement—“No M”; Level
2—paranoid—“Some loss of distance on IV, with
an emotional reaction as if the picture were real”;
Level 3—paranoid—“defective reality-testing com-
bined with blocking and defensive over-caution.”


Table 2 presents the results of these rat- 
ings. With this classification method it was 
possible to show by calculating a chi square 
that there was a significant difference be- 
tween the approach of the successful and un- 
successful judges. The difference found is of- 
fered only as an hypothesis to be tested 
further. Since there were five successful judges 
and six unsuccessful ones, the totaled ratings 
were converted to percentages to facilitate 
comparison. The necessary correction for the 


Table 2
Significance of Difference Between Reasons Given by
Successful and Unsuccessful Judges

                              Per cent of responses by levels
                 Number
Judges          of judges     Level 1    Level 2    Level 3
Successful          5           14.9       23.4       61.7
Unsuccessful        6           22.4       47.4       30.2





use of percentages was applied. It can be seen 
from the table that the statements of success- 
ful judges fell in Levels 1 and 2 less fre- 
quently than did those of the unsuccessful 
judges, and that 61% of successful judges’ 
statements were in Level 3 as contrasted to 
30% of those of unsuccessful judges. It was 
the impression of the two raters that indi- 
vidual judges from the unsuccessful group 
tended to follow one approach—Level 1, 2, 
or 3—more rigidly than did successful judges, 
although this cannot be demonstrated sta- 
tistically. Successful judges would shift from 
one level of abstractness to another in evalu- 
ating a given protocol, suggesting a greater 
degree of adaptiveness and ability to be se- 
lective on the part of these judges in deciding 
what is pertinent in a given record. 
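
The chi-square comparison described above can be illustrated with a minimal sketch. The article reports only percentages, so the statement counts used below are hypothetical: they were chosen merely to agree with the rounded percentages in Table 2 and are not data from the study. The function name `chi_square` is likewise an assumption, and the correction for the use of percentages mentioned in the text is not reproduced.

```python
# Hypothetical statement counts for Level 1, Level 2, Level 3.
# NOT reported in the article; illustrative values that happen to
# round to the percentages shown in Table 2.
observed = {
    "successful": [14, 22, 58],
    "unsuccessful": [26, 55, 35],
}


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    rows = list(table.values())
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    n = sum(row_totals)
    stat = 0.0
    for row, row_total in zip(rows, row_totals):
        for obs, col_total in zip(row, col_totals):
            expected = row_total * col_total / n
            stat += (obs - expected) ** 2 / expected
    df = (len(rows) - 1) * (len(col_totals) - 1)
    return stat, df


def percentages(counts):
    """Convert a row of counts to percentages rounded to one decimal."""
    total = sum(counts)
    return [round(100 * c / total, 1) for c in counts]


stat, df = chi_square(observed)
print(percentages(observed["successful"]), percentages(observed["unsuccessful"]))
print(f"chi-square = {stat:.2f}, df = {df}")
```

With these illustrative counts the statistic exceeds 5.99, the .05 critical value for 2 degrees of freedom. The article does not report its own chi-square value, so this is a check on the direction of the result only.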


Discussion 


The results of this study justify the conclu- 
sion that some experienced clinicians, on the 
basis of total Rorschach protocols, can iden- 
tify rather clear-cut patient groups with a de- 
gree of success better than chance. In evaluat- 
ing this conclusion, consideration should be 
given to the conditions of the experiment re- 
ported here: the choice called for was re- 
stricted to five categories, and other details 
of procedure were highly favorable to correct 
judgments. 

The judges did indeed attain a high degree 
of success in identifying the Rorschachs of 
adult imbeciles. On the other hand, they were 
right only half of the time in distinguishing 
between depression, neurosis, paranoid schizo- 
phrenia, and brain damage. This degree of 
success is certainly not impressive enough to 
justify expansive claims for the value of the 
Rorschach as a technique in identifying pa- 
tient groups. 

Of the twenty judges in the study, five 
succeeded in identifying correctly all five 
protocols submitted to them. Four judges 
missed in all, or in all but one, of the choices 
they made. The study does not justify any 
firm conclusions as to the consistent differ- 
ential ability of various judges. In all prob- 
ability, some judges are better than others. 
Tentative comparisons, offered only as pos- 
sible leads, were made between those judges 
who seemed most successful, and those who 







seemed least successful. These comparisons 
suggest, only as possible questions for future 
consideration, that successful judges: (a) 
have had recent experience in interpreting 
“blind” Rorschachs, where the Rorschachs 
were actually administered by someone else; 
(b) show considerable flexibility in shifting 
from one level of interpretation to another; 
and (c) tend to be free of slavish adherence 
to textbook statements in regard to scores, 
“signs,” etc.; but rather judge in terms of 
second or third level inferences related to 
over-all concepts of psychopathology. None 
of these suggestions are actually tested in the 
experiment reported. 


Summary 


This study investigated the validity of 
judgments made by clinicians on the basis of 
“blind” Rorschach records and analyzed the 
thought processes of clinicians in making 
these judgments. 

Each of 20 clinicians, experienced in the 
use of the Rorschach technique, was given 
the task of identifying five Rorschach proto- 
cols according to clinical group. The groups 
were: (a) involutional depression; (b) paranoid 
schizophrenia; (c) anxiety neurosis; (d) 
brain damage due to syphilis; and (e) adult 
mental deficiency. The judges were told which 
five clinical groups were represented and that 
they would receive one Rorschach record from 
each group. The secondary phase of the 
study, concerning the thought processes of 
the judges, was a pilot investigation. Each 
judge was asked to make four statements 
summarizing elements of major importance 
influencing his thinking in arriving at the 
decision on each record. A method for clas- 
sifying these statements was devised by com- 
paring statements of the five most successful 
judges with those of the six least successful 
judges. 


The following major conclusions were 
reached with reference to this sample of Rorschach 
records and these judges: 

1. Some clinicians can identify “blind” Ror- 
schach records according to clinical groups, 
when cases are selected for homogeneity and 
groups are limited. Out of 100 possible judg- 
ments, clinicians were correct 58 times; five 
judges contributed 25 of the correct judg- 
ments; nine judges, 27; and the remaining six 
judges, 6. 

2. Rorschachs of mental defectives can be 
identified in 90% of the cases. Judges were 
correct 51% of the time in distinguishing between 
depression, neurosis, paranoid schizophrenia, 
and brain damage. 

3. The method proposed for analyzing the 
thought processes of clinicians working with 
the Rorschach indicated that there is a sig- 
nificant difference between the approaches of 
successful and unsuccessful judges. 
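
To put the 58 correct judgments out of 100 in perspective, the chance baseline for this kind of sorting task can be worked out exactly. The sketch below assumes that each judge assigned the five records one-to-one to the five named groups, as the instructions to the judges suggest; the article does not state that this was enforced, and the function name `match_distribution` is hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def match_distribution(n):
    """Exact chance distribution of the number of correct matches when
    n records are assigned one-to-one to n categories at random."""
    counts = [0] * (n + 1)
    total = 0
    for perm in permutations(range(n)):
        hits = sum(1 for position, label in enumerate(perm) if position == label)
        counts[hits] += 1
        total += 1
    return [Fraction(c, total) for c in counts]


dist = match_distribution(5)
expected_hits = sum(k * p for k, p in enumerate(dist))  # mean correct per judge
p_all_five = dist[5]                                    # chance of a perfect sort

for k, p in enumerate(dist):
    print(f"P({k} correct) = {p}")
print("expected correct per judge:", expected_hits)
```

Under this assumption a judge sorting at random expects one correct match in five, so twenty judges would expect about 20 correct judgments out of 100, and a perfect sort has probability 1/120 for any one judge.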


Received July 19, 1956. 


References 


1. Benjamin, J. D., & Ebaugh, F. G. The diagnostic 
validity of the Rorschach Test. Amer. J. Psychiat., 
1938, 94, 1163-78. 

2. Dixon, W. J. Analysis of extreme values. Ann. 
math. Statist., 1950, 21, 488-506. 

3. Dudek, F. J. Determining “chance success” when 
a specific number of items are sorted into discrete 
categories. J. consult. Psychol., 1952, 16, 
251-256. 

4. Hamlin, R. M. The clinician as judge: implications 
of a series of studies. J. consult. Psychol., 
1954, 18, 233-238. 

5. Hertz, Marguerite R., & Rubenstein, B. B. A comparison 
of three “blind” Rorschach analyses. 
Amer. J. Orthopsychiat., 1939, 9, 295-314. 

6. Hoyt, C. Test reliability obtained by analysis of 
variance. Psychometrika, 1941, 6, 153-160. 

7. Hunt, W. A. The future of diagnostic testing in 
clinical psychology. J. clin. Psychol., 1946, 2, 
311-317. 

8. Klopfer, B., Ainsworth, Mary, Klopfer, G., & 
Holt, R. Developments in the Rorschach technique. 
Yonkers, N. Y.: World Book Co., 1954. 

9. Symposium. The case of Gregor: Interpretation of 
test data. Rorschach Res. Exch., 1949, 13, 
433-468. 








Journal of Consulting Psychology 
Vol. 21, No. 2, 1957 


A Comparison of Client and Therapist Ratings 
on Two Psychotherapeutic Variables¹ 


Malcolm H. Robertson 


Purdue University 


This study was designed to determine the 
extent of agreement between clients and thera- 
pists about changes during psychotherapy. 
Howard and Kelly (1) have suggested that 
anticipation of change may lead an individual 
to exaggerate any actual change. Consequently, 
observers may note little if any behavioral 
change, but the individual reports 
far more change because he interprets what 
has occurred in terms of his anticipation of 
greater change. Thus, we would expect agree- 
ment between client and therapist when the 
client indicates no change. We would not 
expect agreement when the client indicates 
change. However, if both acknowledge that a 
change has occurred, we would expect them 
to agree on whether they are satisfied with 
the change. Where both acknowledge that a 
change has not occurred, we would not ex- 
pect them to agree on whether they are 
satisfied. 

A questionnaire of 12 statements related to 
changes in feelings or interpersonal behavior 
was administered to 23 clients and 16 thera- 
pists from two mental hygiene clinics. Each 
subject answered yes or no to whether there 
had been a change in a certain direction. The 
responses of each client and therapist pair 
were compared by means of chi square. The 
data were collected while the clients were still 
receiving psychotherapy. 


¹ An extended report of this study may be obtained 
without charge from Malcolm H. Robertson, 
Student Counseling Service, University, Mississippi, 
or for a fee from the American Documentation In- 
stitute. Order Document No. 5104, remitting $1.25 
for microfilm or $1.25 for photocopies. 


The results show that when clients indicate 
no specific change, there is a strong trend, 
though not statistically significant, for thera- 
pists to agree with them. When clients indi- 
cate a specific change, there is no trend to- 
ward agreement. This finding is consistent 
with the idea that clients in anticipating much 
change may be led to exaggerate any slight 
change. Where no change has occurred, such 
exaggeration effects would not be expected. 
Moreover, when clients and therapists agree 
that certain changes have occurred, there is 
also agreement (p = .05) regarding satisfac- 
tion over these changes. When they agree that 
certain changes have not taken place, there is 
no trend toward agreement about the satis- 
faction over the absence of these changes. 
One explanation for this finding might be 
that as change occurs, both client and thera- 
pist may note the actual effects of this change 
in the client’s adjustment. If no change oc- 
curs, their evaluations may be based on dif- 
ferent anticipations. 

We realize that a complete evaluation of 
such changes in psychotherapy should take 
into account not only the degree of change 
but also some estimate of the permanency 
of change. 


Brief Report. 
Received December 24, 1956. 


Reference 


1. Howard, A., & Kelly, G. A theoretical approach 
to psychological movement. J. abnorm. soc. 
Psychol., 1954, 49, 399-404. 




Journal of Consulting Psychology 
Vol. 21, No. 2, 1957 


Studies in Fantasy—Daydreaming Frequency 
and Rorschach Scoring Categories¹ 


Horace A. Page² 


University of Wisconsin 


Fantasy has traditionally assumed a role of 
importance in tests of a projective nature. 
The Rorschach has figured prominently among 
those tests which are considered capable of 
eliciting information about an individual’s 
tendency to engage in fantasy. In this study, 
the interest was directed to a consideration 
of the relationship between various formal 
aspects of Rorschach test performance and 
an independent assessment of the frequency 
of daydreaming behavior. 

Special attention is addressed to Rorschach 
movement response. Writers have typically 
stressed the relationship between movement 
responses, particularly human movement or 
M, and the extent to which S is free to uti- 
lize his imaginal processes. Klopfer makes 
this point in Developments in the Rorschach 
Technique (6, p. 256), while Beck makes an 
even stronger statement in the second volume 
of Rorschach’s Test where he states, “.. . 
Rorschach’s penetration to the essence of the 
movement response (M) as fantasy activity 
is his greatest achievement—original as is his 
contribution to the many-sided instrument he 
fashioned. His M opens to investigation a 
sector of personality that, effective as it is in 
determining an individual’s course, has been 
elusive of efforts to study it objectively” (1, 
p. 22). In this research movement responses 
are emphasized, but other formal characteristics 
of Rorschach performance were also 
considered. 


¹ This paper was presented at the 1956 meeting of 
the Midwestern Psychological Association. 

² The author is indebted to Miss Gloria Markowitz 
and Mr. Conrad Nuthmann for their assistance in 
analysis of the data and to Dr. Richard M. Lundy 
for his helpful comments. This research was supported 
in part by funds supplied by the Research 
Committee of the University of Wisconsin Graduate 
School. 


Method 


Assessment of daydreaming behavior. A 
Fantasy Scale of 201 items was administered 
to the Ss. Items in the scale represented dif- 
ferent imaginal themes, the majority of which 
were obtained from a large number of anony- 
mous reports of personally experienced day- 
dreams, submitted by 150 male and female 
college students. Table 1 presents some of the 
items from the scale. The Ss were asked to 
specify the frequency with which each fan- 
tasy was experienced on a five-point scale 
which ranged from 1 (Never experienced such 
a fantasy) to 5 (Very Frequently, experienced 
such a fantasy once a day or more frequently). 
A Productivity Score was obtained 
by simply summing the values assigned by S 
to those items which referred to various fan- 
tasies. The inventory was administered to 
groups, Ss recording their answers on a multi- 
ple-choice IBM form. 

Subjects. Eighty sophomore, junior, and 
senior women obtained from an introductory 
psychology course at the University of Wis- 
consin were administered the Fantasy scale. 
Fantasy scales were scored for productivity, 
and Ss in the upper and lower 25% of the 
distribution were selected and designated as 
the high and low daydreaming groups. Table 2 
shows that these groups are relatively com- 
parable in such characteristics as age and col- 
lege class with the exception of a greater 
heterogeneity in the ages of the high fre- 
quency group. 
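
The scoring and selection steps just described can be sketched briefly in Python. The subject labels and ratings below are hypothetical, as are the function names `productivity_score` and `extreme_groups`; the sketch also assumes that tied scores at a cut point are resolved by sort order, a detail the article does not specify.

```python
def productivity_score(ratings):
    """Sum the 1-to-5 frequency ratings a subject gave to the fantasy items."""
    for value in ratings:
        if not 1 <= value <= 5:
            raise ValueError(f"rating out of range: {value}")
    return sum(ratings)


def extreme_groups(scores, fraction=0.25):
    """Return (high_ids, low_ids): the subjects in the upper and lower
    `fraction` of the score distribution."""
    k = int(len(scores) * fraction)
    ranked = sorted(scores, key=scores.get)  # ascending by score
    low_ids = ranked[:k]
    high_ids = ranked[len(ranked) - k:]
    return high_ids, low_ids


# Hypothetical ratings for eight subjects on four items (illustration only).
ratings_by_subject = {
    "S1": [1, 2, 1, 1], "S2": [5, 4, 5, 3], "S3": [2, 3, 2, 2], "S4": [3, 3, 3, 3],
    "S5": [5, 5, 5, 5], "S6": [1, 2, 2, 2], "S7": [4, 3, 4, 3], "S8": [2, 3, 3, 2],
}
scores = {sid: productivity_score(r) for sid, r in ratings_by_subject.items()}
high, low = extreme_groups(scores)
print("high daydreaming group:", high)
print("low daydreaming group:", low)
```

Applied to the 80 subjects of the study, a fraction of 0.25 would place 20 subjects in each group.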

Rorschach test. The Ss in the high and low 
daydreaming groups were administered the 




Table 1 

Illustrative Items from the Fantasy Scale 

34. In my fantasies, I am at a social gathering where 
I have a very good time. 

43. In my fantasies, someone wise and understanding 
solves my problems. 

84. In my fantasies, I picture what it would be like 
to be different animals. 

104. I daydream that I defeat a rival and win out in a 
romance. 

128. In my fantasies, I am a machine-gun and mow 
down enemy troops. 

160. In my fantasies, I picture what it would be like 
if I were the only one left on earth. 

168. I imagine that I am in a far away place where I 
have nothing to do but bask in the sun, eat good 
food, and enjoy life. 

194. In my fantasies, I imagine what it would be like 
if I were physically disabled. 








Group Rorschach using the slides and forms 
prepared by Harrower (4). From 10 to 15 
Ss were tested at one time under conditions 
which assured similarity of position in rela- 
tion to the screen and which reduced the pos- 
sibility of communication between Ss. 
Rorschachs were scored by the author using 
the system described by Klopfer and Kelley 
(7). The reliability of the author’s scoring 
behavior was assessed in another study and 
had been considered acceptable (9). In addi- 
tion to these scoring procedures, a Rorschach 
content analysis for anxiety and hostility was 
made as described by Elizur (3), and move- 
ment responses received an additional special 
scoring. For this latter evaluation, movement 
responses were scored only when Ss clearly 
verbalized movement in their initial exposure 


Table 2 

Age and College Class of Subjects in the High and 
Low Daydreaming Groups 

                     High     Low      CR      P of Diff. 

Age        Mean     19.60    19.15    1.15     NS 
           σ         1.43      .85    2.22     <.05 
           Range    18-24    18-21 

College    Mean      2.55     2.20    1.32     NS 
year*      σ          .67   [illegible]   [illegible]   NS 
           Range      2-4      1-4 

* Freshman standing = 1, sophomore = 2, junior = 3, and 
senior year = 4. 




to the blots. Popular percepts as described by 
Klopfer and Kelley (7) were not included. It 
was felt that such movement scoring criteria 
might provide a measure which would be 
more sensitive to differences in daydreaming 
frequency. 


Results 


Initially a test was made of the absolute 
response number (R) obtained by the two 
groups. A mean R of 38.03 was obtained for 
the high group and a mean R of 28.21 was 
observed in the low group, a difference sig- 
nificant beyond the .05 level. This finding 
made it necessary to account for the relation- 
ship of all measures to response number prior 
to testing their discriminative power between 
the two daydreaming groups. Table 3 con- 
tains the results of analysis of variance tests 
between high and low daydreaming Ss who 
were matched on the basis of Rorschach productivity. 
M, combined movement, F, and 
animal responses were considered in this fashion. 
None of the tests reveal significant differences 
attributable to daydreaming frequency. 
The importance of controlling for R 


Table 3 

Analysis of Variance Tests of Ten High and Ten 
Low Daydreaming Rorschachs Matched 
for Response Number 

                                    Mean     Mean     P of 
Scoring category                    High     Low      Diff. 

Human Movement 
  Daydreaming Groups                 7.4      6.9      -- 
  Productivity (R)                                     -- 
  Daydreaming X Productivity                           -- 

Combined Movement (M + FM + m) 
  Daydreaming Groups                14.3     12.9      -- 
  Productivity (R)                                     -- 
  Daydreaming X Productivity                           -- 

Form Determined Responses 
  Daydreaming Groups                 3.1      3.5      -- 
  Productivity (R)                                     .05 
  Daydreaming X Productivity                           -- 

Animal Responses 
  Daydreaming Groups                13.6     12.3      -- 
  Productivity (R)                                     .05 
  Daydreaming X Productivity                           -- 










Table 4 

Results of t Tests Between High and Low 
Frequency Daydreaming Groups 

                          Mean      Mean 
                          High      Low       P of 
Rorschach Measure         Group     Group     Diff. 

Response Number (R)       38.03     28.21     <.05 
Total Words              313.6     296.8      -- 
Words per Response        11.1      12.5      -- 
Anxiety Content            7.7       5.4      <.10 
Hostility Content          6.9       5.9      -- 


is evidenced in the significant productivity 
effects seen for F and animal responses. 
Table 4 includes t tests between the high 
and low groups. Although a significant difference 
in response number occurs, similar 
differences are not noted for either the total 
number of words or the average number of 
words per response. The Elizur anxiety and 
hostility ratings are in a direction indicative 
of a greater incidence of these characteristics 
in the high frequency daydreaming group, but 
these differences are not statistically reliable. 
An analysis of the results of the special 
scoring procedure for movement responses is 
presented in Table 5. When scored with a 
somewhat more demanding criterion, it will 
be noted that the incidence of M responses 
does differ in the two groups, with the fre- 
quent daydreaming Ss showing a greater 


Table 5 

Movement Responses (Special Scoring) for High 
and Low Daydreaming Groups 

Movement                      Mean      Mean 
Response                      High      Low       P of 
Categories                    Group     Group     Diff. 

M                             3.71      1.76      <.05 
FM                            2.05      1.70      <.10 
m                             1.70      1.00      no test 
M in Hd                        .70       .12      no test 
FM in Ad                       .29       .06      no test 
M, FM, or m 
  in unusual location 
  (dr, dd, s)                 1.41       .53      no test 
Minus Responses 
  (M, FM, or m)                .65       .18      no test 
















mean number of such percepts. The low in- 
cidence of responses in some of the other 
categories considered precluded the applica- 
tion of statistical tests, but it is of note that 
the high daydreaming group shows a greater 
number of movement responses which appear 
in human details, are located in unusual blot 
areas, and are more typically of poor or 
minus form quality. 


Discussion 


A variety of tests have been made of the 
relations of certain aspects of performance on 
the Rorschach to the frequency of daydreaming 
behavior as indicated by a “self-report” 
instrument. It is of interest that the hypothe- 
ses regarding movement responses are, to some 
extent, substantiated. A significant difference 
was obtained between the two daydreaming 
groups in the incidence of human movement 
responses when popular percepts were elimi- 
nated and a more rigorous scoring criterion 
was introduced. These findings provide support 
for the notion that the tendency to perceive 
movement in the Rorschach is associated 
with fantasy activity. In addition, there 
are qualitative indications suggestive of a 
tendency for the frequent daydreamer to per- 
ceive movement in partial human figures, in 
unusual locations, and with form of lower or 
minus quality. 

Although not necessarily predicted by Ror- 
schach theory, it is of interest to note that the 
two groups differ in number of responses. To 
have ignored this finding would have resulted 
in the determination of differences in other 
scoring categories. As Cronbach has sug- 
gested, however, it is more parsimonious to 
assume that R is playing a causal role (2). 
The number of words and the number of 
words per response did not differentiate the 
two groups. In a somewhat similar analysis 
of TAT data, daydreaming frequency was not 
found to be related to the amount of verbali- 
zation (8). 

In conclusion, it can be suggested that there 
are some relationships between daydreaming 
behavior and performance on the Rorschach 
test. These data are consistent with Rorschach 
theory in the sense that positive findings are 
noted for movement responses. These results, 
however, are not of a magnitude which would 




warrant their employment in the interpreta- 
tion of the individual protocol. In support of 
the Rorschach it should be recognized that 
the Fantasy scale as a self-report measure is 
subject to certain limitations. 


Received July 6, 1956. 


References 


1. Beck, S. Rorschach’s test. Vol. II. New York: 
Grune & Stratton, 1949. 

2. Cronbach, L. J. Statistical methods applied to 
Rorschach scores, a review. Psychol. Bull., 
1949, 46, 393-429. 

3. Elizur, A. Content analysis of the Rorschach with 
regard to anxiety and hostility. Rorschach 
Res. Exch., 1949, 13, 247-284. 

4. Harrower, M. R., & Steiner, M. E. Large scale 
Rorschach techniques. Springfield, Ill.: Charles 
C Thomas, 1945. 

5. Kendall, M. G. Rank correlation methods. London: 
Griffin, 1948. 

6. Klopfer, B., & Kelley, D. Mc. The Rorschach 
technique. Yonkers, N. Y.: World Book Co., 
1942. 

7. Klopfer, B., Ainsworth, Mary D., Klopfer, W. G., 
& Holt, R. R. Developments in the Rorschach 
technique. Vol. I. Yonkers, N. Y.: World 
Book Co., 1954. 

8. Page, H. A. Studies in fantasy, daydreaming and 
the TAT. Amer. Psychologist, 1956, 11, 392. 
(Abstract) 

9. Snyder, W. U. (Ed.) Group report of a program 
of research in psychotherapy. State College, 
Pa.: School of Education, Pennsylvania State 
University, 1953. 







Journal of Consulting Psychology 
Vol. 21, No. 2, 1957 





Levels of Prediction from the TAT 


Seymour Fisher and Robert B. Morton 
VA Hospital, Houston, Texas 


The literature is replete with studies and 
speculations concerning the kinds of phe- 
nomena which can be predicted from projec- 
tive test responses. Summaries of this mate- 
rial are available elsewhere (4, 7, 9). It is 
somewhat confusing to examine the evidence 
concerning the validity of projective tests be- 
cause it is so contradictory. One investigator 
reports great success in predicting various be- 
haviors from the Rorschach or TAT and an- 
other investigator reports completely negative 
results. These divergences are in many in- 
stances due to obvious differences in subject 
populations used and to variations in pro- 
cedure. Thus, it is clearly easier to predict in 
a heterogeneous group than in one that is 
highly selected. Likewise, it is clear that some 
projective test indices are more cleverly de- 
vised than others and therefore give more 
valid results. However, aside from such ob- 
vious factors, there are important differences 
in results which seem to be a function of the 
area of behavior one attempts to predict. 

Kagan and Mussen (4) have pointed out 
that past studies have found less significant 
relationships between fantasy and behavior 
that is prohibited or punished in the individu- 
al’s social milieu than between fantasy and 
behavior that is culturally sanctioned. This 
point is illustrated by the fact that various 
researchers (1, 5, 8) have not found signifi- 
cant relationships between amount of aggres- 
sive TAT or doll-play fantasy and degree of 
overt aggression among subjects from a mid- 
dle-class milieu. But Mussen and Naylor (7) 
demonstrated a significant link between TAT 
aggressive fantasy and overt aggressive be- 
havior in a group of lower-class boys for 
whom this sort of behavior is more likely to 
be approved. Apparently, if the individual is 
set to conceal certain aspects of his behavior, 







this decreases the correlation of such behav- 
ior with logically related areas of fantasy. 

In an analogous vein, might one not an- 
ticipate differences in fantasy vs. behavior re- 
lationships as a function of other behavioral 
dimensions? Is verbal behavior easier to pre- 
dict from fantasy than nonverbal behavior? 
Is behavior over which the individual has no 
conscious control easier to predict than be- 
havior which he can consciously influence? 
Are certain kinds of verbal behavior less diffi- 
cult to predict than others? Are behaviors 
that are usually conceptualized in purely 
physiological terms more or less predictable 
than behaviors occurring at the level of ver- 
balization and striate muscular response? The 
present study represents an attempt to answer 
some of these questions. More specifically, 
the intent was to determine the relationships 
of two different TAT scores to a whole range 
of behaviors which had been measured in a 
population of individuals who were hospital- 
ized for tuberculosis. 


Methods 
Behavioral Measures 


The opportunity for examining such issues 
was provided in terms of a body of data 
which was collected by Moran, Fairweather, 
Morton, et al. (2, 6). As part of a large-scale 
study of the adjustment of patients with tu- 
berculosis, they obtained a wide variety of 
measures on a group of 140 male veterans 
who were receiving treatment for tuberculosis 
in a Veterans Administration hospital. The 
methods used in selecting this population, 
and the behavioral measures obtained, have 
already been described in detail elsewhere 
(2). Therefore, they will only be briefly 
listed and summarized. These measures were 
of the following order: 




A. Verbal responses from each patient concerning 
his immediate attitudes toward the hospital situa- 
tion. There were various items which touched on his 
feelings about ward regulations, ward personnel, and 
other patients. The items were expressed in terms of 
statements with which the patient could agree or 
disagree. Scoring of responses was based on an 
a priori judgment as to whether agreement or dis- 
agreement with a particular item represented an 
adaptive or maladaptive attitude toward the hos- 
pital situation. By and large, adaptive attitudes were 
equated with wanting to conform to regulations and 
expressing positive reactions toward personnel and 
other patients. 

B. Verbal responses to a series of questions con- 
cerning prehospital adjustment. These questions con- 
cerned such a range of things as school attendance, 
number of close friends, and social achievement. An- 
swers to questions were scored in terms of a priori 
judgments concerning which of the reported behav- 
iors are adaptive vs. maladaptive. Thus, getting into 
fights, having few friends, and belonging to few or- 
ganizations are examples of behaviors which would 
be scored in the maladaptive direction. 

C. Verbal responses to a series of questions con- 
cerning the characteristics of the patient’s original 
family orientation. The questions diversely concerned 
such topics as parents’ economic status, parents’ edu- 
cational status, and parents’ mode of disciplining 
children. Answers were scored relative to a priori 
standards of what is adaptive and maladaptive. Illus- 
tratively, adaptive scores were given for reports of 
high parental economic status and high parental oc- 
cupational attainment. 

D. Verbal responses to a series of questions con- 
cerning the status of the patient’s current adjust- 
ments outside the hospital situation. Patients were 
questioned concerning such issues as their antici- 
pated employability when they recovered from their 
illness and attitudes of their dependents toward their 
hospitalization. Responses were considered to be 
adaptive if, for example, they indicated that the im- 
mediate family was well taken care of economically 
or that the family had a positive accepting attitude 
toward the patient’s hospitalization. 

E. Ratings of the actual ward behavior of each 
patient. Two aides and two nurses who had usually 
attended the patient for several months prior to 
rating independently evaluated him on a 64-item 
scale. The items in the scale related to conformance 
to regulations (e.g., staying in bed, covering coughs), 
relations with ward personnel (e.g., how demanding 
or convivial), and relations with other patients (e.g., 
hazing and disturbing others). Each rating item of- 
fered two alternatives, one of which was judged on 
an a priori basis to be adaptive. The score for a spe- 
cific item was the number of times the adaptive al- 
ternative was checked by all four raters. Once again, 
adaptive response was considered to be in the direc- 
tion of obeying regulations and getting along with 
others. 

F. The ability of the patient to remain in the hospital 
for the full period required to complete treatment. 
Some patients find it difficult to tolerate the 
demands made upon them as the result of living in 
a tuberculosis treatment ward. Such patients will 
leave the hospital despite the fact that they are still 
sick and will seriously endanger the health of mem- 
bers of their family if they return home prematurely. 
It is obviously a maladaptive response to leave the 
hospital in this fashion and it provides a clear-cut 
index of poor adjustment to the hospital situation. 

G. Rate of recovery from the tubercular infection. 
This rate of recovery variable was defined in terms 
of the length of time required for the patient to con- 
vert from positive to negative bacteriology. Each 
patient’s sputum and gastric contents are routinely 
checked at intervals to determine their bacteriologi- 
cal status. A negative report means the absence of 
tubercle bacilli in the laboratory specimen. Within 
the context of this study, the patient was considered 
to be “converted” from positive to negative bacterio- 
logically after five successive negative laboratory 
reports. Only 46 patients were involved in this phase 
of the study. They were subjects who had been care- 
fully selected to participate in a national research on 
the effectiveness of chemotherapy. Patients who con- 
verted bacteriologically within five to eight months 
were designated a fast recovery group and patients 
who did not convert within this time period were 
designated as a slow recovery group. 
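
The conversion criterion described under G, five successive negative laboratory reports, can be illustrated with a short sketch. The function name `conversion_index` and the sample sequence of reports are hypothetical. The article does not say whether the conversion date was taken as the first or the last report of the run, so the sketch simply returns the position of the report that completes the run.

```python
def conversion_index(reports, run_length=5):
    """Return the 0-based index of the report that completes the first run of
    `run_length` successive negative reports, or None if no such run occurs.

    `reports` is a sequence of booleans in chronological order,
    True meaning a negative (bacilli-free) laboratory report.
    """
    streak = 0
    for index, is_negative in enumerate(reports):
        streak = streak + 1 if is_negative else 0  # a positive report resets the run
        if streak == run_length:
            return index
    return None


# Hypothetical sequence of reports for one patient (illustration only).
reports = [False, True, True, False, True, True, True, True, True, True]
completed_at = conversion_index(reports)
print("run of five negatives completed at report index:", completed_at)
```

If the start of the run is wanted instead, it is the returned index minus `run_length - 1`.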


The various measures listed above provided 
samples of many different kinds of behavior 
of a group of men who were of lower- to 
lower-middle socioeconomic status; who could 
be considered part of the normal population 
in that they did not exhibit an unusual fre- 
quency of personality disorders; and who at 
the time evaluated were living in a similar 
standardized environment. 

For the purposes of the present study a 
number of individual measures were derived 
from the total available array. These indi- 
vidual measures were selected so as to con- 
form to a particular conceptual scheme. It 
was postulated that any behavior to be pre- 
dicted may be categorized on a continuum 
having to do with how easily the individual is 
consciously able to camouflage that behavior 
so as to make it appear socially acceptable or 
how motivated he is to do so. In illustration, 
there might be cited at one extreme the rate 
of bacteriological conversion or the long-term 
ward behavior of a tubercular patient who is 
intensively observed by nurses and aides. A 
patient cannot consciously influence his rate 
of bacteriological conversion. It is doubtful 
also that a patient could for long dissimulate 
sufficiently to prevent nurses and aides who 







observed him intimately from detecting his 
basic modes of reaction to the ward situation. 
At the other extreme are behaviors which 
simply involve the patient’s own verbal de- 
scriptions of things. The patient is then free 
to shape and distort his descriptions within a 
wide range of possibilities that fit his needs. 
However, within this area of verbal report, 
there may be distinguished those descriptions 
which have important ego-involving signifi- 
cance to the patient and which might there- 
fore be twisted by him in a self-enhancing 
manner. At quite a different level are verbal 
reports regarding matters that have relatively 
limited emotional significance and which the 
patient has only minor need to distort in 
terms of his own self-protective attitudes. 
Thus, a patient might self-protectively warp 
his answers concerning how much he likes 
the hospital or his ward physician, but be 
without temptation to do so in reply to sim- 
ple factual questions concerning how much 
he participates in sports or the kind of recreation
he enjoys. In line with this conceptual
scheme, the following behaviors were selected
for study:


A. Those behaviors which are relatively difficult 
for the patient to influence or to camouflage in a 
self-protective, socially approved direction. Three 
measures are included in this category. 

1. Rate of bacteriological conversion. 

2. Evaluations made by nurses and aides of the 
patient’s actual overt adaptation to the ward situa- 
tion. 

3. The differentiation between patients who leave 
the hospital prematurely before treatment is com- 
plete and those who remain for an optimum treat- 
ment period. This differentiation involved 26 of 140 
patients who left prematurely and a comparison 
group of 45 patients randomly selected who re- 
mained for full treatment. 

B. Verbal reports concerning issues which have 
relatively low ego involvement for the patient and 
which are likely not to be camouflaged. Two meas- 
ures are included here: 

1. A group of thirteen items concerning the pa- 
tient’s relationships with peers from childhood to the 
period just preceding his hospitalization. The items 
mainly refer to such factors as the number and 
kinds of social organizations in which membership 
was held; preferred forms of recreation with friends; 
number of friends; number of friendships formed in 
the army; and degree to which army friendships 
have carried over into civilian life. The scoring of 
the answers is intended to evaluate how actively and 
fully the individual has interacted with his peers. 
Two-thirds of the items refer to childhood and 


adolescent behavior rather than to adult behavior. 
It was considered that these questions would be 
relatively nonthreatening because their apparent in- 
tent was vague and because most of them were 
phrased in a bland innocuous fashion. 

2. A group of three items concerning the occupa- 
tional level of the patient’s parents. These were ques- 
tions that simply requested information concerning 
the type of work done by the parents and how 
steadily they were employed. 

C. Verbal reports concerning issues that have rela- 
tively high ego-involving significance and are likely 
to result in camouflage and distortion. Four measures 
are embraced by this category: 

1. Forty-five questions concerning how the pa- 
tient feels about various aspects of his immediate 
situation in the hospital. These questions required 
the patient to express to the interviewer his opinions 
about his physician, the nurses, the aides, the quality 
of food served, and so forth. It was considered that 
such questions tapped opinions which the average 
patient would be embarrassed to voice openly. For
example, if he disapproved of his physician or 
thought the nurses were inefficient he might be 
anxious that an open expression of such feeling 
would get him into trouble. Thus, his tendency 
would be to play down the negative aspects of his 
attitudes. 

2. A cluster of three questions relating to how the 
patient currently perceives the attitudes of significant 
figures outside the hospital toward his hospitaliza- 
tion. The questions requested information as to how 
alarmed his dependents were by his illness; whether 
his dependents favored or did not favor his initial 
hospitalization; and whether any person significant 
to him was pressuring him to hurry up and leave 
the hospital. It seemed likely that such questions 
would probe into areas of high tension (e.g., guilt
about the plight of dependents) and elicit defensive 
disguised verbal replies. 

3. A cluster of nine questions that refer to how 
well the patient’s parents supplied him with a stable 
environment with consistent rules and limits. These 
questions concerned how much time the parents 
spent at home, the kind of discipline they adminis- 
tered, and whether they were given to unusual drink- 
ing or drug addiction. It was assumed that such 
questions would elicit defensive responses from pa- 
tients because they touched on a deeply personal 
aspect of one’s relationship to his parents (viz., 
discipline and punishment) and because they re- 
quired reports about socially highly disapproved as- 
pects of behavior (viz., drinking and drug behavior) 
of the parents. 

4. A group of ten questions regarding the patient’s 
ability to get along with authority figures from the 
time of grade school to the present. These questions 
variously inquired concerning such issues as past ar- 
rests, failures in school, and failures in the army. 
Obviously, there would be marked temptation to
distort answers to questions that had the intent of
extracting so much negative information concerning 
one’s past. 







Fantasy Measures 


Ten cards on the TAT were administered 
to each patient: 1, 2, 4, 12M, 8BM, 6BM, 
17BM, 18BM, 6GF, 7BM. In choosing TAT 
measures which would be most likely to pre- 
dict a wide range of behavior of tubercular 
patients, special consideration was given to 
the hospital situation in which this behavior 
occurs. The average patient who comes to be 
treated for tuberculosis in the VA hospital 
setting in which the present study was carried 
out is required to adjust to a very new way of 
life. He enters into a situation in which he 
has to change radically many of his previous 
ways of doing things. He has to endure a va- 
riety of procedures which are restraining and 
frustrating. He has to give up many of his im- 
mediate life goals with the hope that in so 
doing he will better his long-term prospects. 
In this setting his physician becomes a cen- 
tral figure whose decisions take on magnified 
importance. These decisions determine how 
much freedom he has in the hospital and of 
course are always fraught with implications 
concerning how well the treatment is pro- 
gressing. The patient comes to attach exag- 
gerated importance to the words and gestures 
of his physician; and his mood tone may 
fluctuate up and down as he ascribes first 
one and then another significance to such 
words and gestures. 

These special features of the hospital treat- 
ment situation suggested that two kinds of 
fantasy variables might be particularly perti- 
nent as predictors of the behaviors of the tu- 
berculosis patient: 


1. The hospital situation is one which clearly re- 
quires an unusual degree of immediate passivity from 
the patient. Yet, it also requires a long-term active 
or aspiring attitude in the sense of being willing to 
put up with immediate frustrations in anticipation 
of attaining basic future goals. It therefore seemed 
logical that a fantasy measure concerned with 
achievement and aspiration should be related to as- 
pects of the tubercular patient’s behavior. So many 
of the tubercular patient’s problems seem to cluster 
about issues of activity vs. passivity that one would 
expect his fantasies in this area to be meaningfully 
linked to his patterns of response. The TAT index 
which was selected to get at this dimension of fan- 
tasy is a measure developed for a previous study 
(3). It is an Achievement score based on the num- 
ber of instances in which story characters are de- 
scribed as having high aspirations, engaging in un- 


usually hard work, attaining financial success, doing 
an excellent job, being obligated to great effort, or 
adopting a laudable aim. The score is simply a 
count of all instances in which such characteristics 
are depicted. An interscorer rank-order reliability of 
.62 was obtained on ten independently scored TAT 
records. 

2. The fact that the tubercular patient finds him- 
self in a situation where certain authority figures 
(viz., physician and nurse) take on great prominence 
in his life suggested that his fantasies about father 
figures and mother figures would have particular 
relevance. Since so many decisions about his treat- 
ment and his activities are in the hands of his phy- 
sician and nurse, one would assume that his fan- 
tasies about the male authority figure and the female 
authority figure would have a significant bearing on 
many of his responses in the hospital situation. The 
TAT measure used to get at this dimension was also 
devised for an earlier study (3). It is based on an 
analysis of the stories given to Cards 2, 6BM, and 
7BM, which are cards that seem frequently to evoke 
the maximum number of themes concerning mother 
and father figures. The measure is concerned with 
the definiteness or clarity of the parental images that 
are projected into these TAT pictures. In a previous 
study (3) it was shown that the definiteness of such 
images is significantly related to certain basic per- 
sonality characteristics. The theory underlying the 
measure is that parents who stand for something 
definite (whether it is positive or negative) provide 
their children with well-defined values; whereas par- 
ents who are weak or fluctuating in their position on 
various issues leave their children without clear-cut 
standards of judgment. A parental figure in any TAT 
story was scored as definite if the story described the 
parent in a clearly domineering or unfriendly role or 
in a clearly favorable or friendly role. A parental 
figure was scored as vague or weak if he was de- 
scribed as inadequate or if the story data were too 
fragmentary or unclear to permit classification in 
the “definite” category. An interjudge reliability of
.81 on the dichotomous scores was obtained on
twenty independently scored protocols. A special
adaptation of this basic scoring procedure was uti- 
lized. It was assumed that the patient would usually 
project images of both the father and mother fig- 
ures in his reactions to Card 2. Each image that was 
definite was given a score of +1. If either image
was unclear or weak, it was given a score of -1.
Thus, a maximum definiteness score of +2 and a
minimum definiteness score of -2 could be earned
for Card 2. Either +1 or -1 was scored for Card
6BM and also for Card 7BM. The subject could,
therefore, obtain a total score ranging from +4 to
-4.
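
The Parental Definiteness scoring rule described above can be sketched in a few lines of Python. This is only an illustrative sketch: the label strings, the dictionary layout, and the function name are assumptions introduced here, not part of the original scoring manual. The judgment of whether an image is "definite" is still made by a human scorer; the code only totals the judgments.

```python
from typing import Dict, List

# Hypothetical labels a judge might assign to each parental image.
DEFINITE = "definite"  # clearly domineering/unfriendly or clearly favorable/friendly
VAGUE = "vague"        # inadequate figure, or story too fragmentary to classify

# Images scored per card, following the text: two (father and mother) for
# Card 2, one each for Cards 6BM and 7BM.
IMAGES_PER_CARD: Dict[str, int] = {"2": 2, "6BM": 1, "7BM": 1}


def definiteness_score(ratings: Dict[str, List[str]]) -> int:
    """Add +1 for each definite image and -1 for each vague or weak image."""
    total = 0
    for card, n_images in IMAGES_PER_CARD.items():
        card_ratings = ratings[card]
        if len(card_ratings) != n_images:
            raise ValueError(f"Card {card} needs {n_images} rating(s)")
        for rating in card_ratings:
            if rating not in (DEFINITE, VAGUE):
                raise ValueError(f"Unknown rating: {rating!r}")
            total += 1 if rating == DEFINITE else -1
    return total


# Hypothetical protocol: one vague image on Card 2, all others definite.
example = {"2": [DEFINITE, VAGUE], "6BM": [DEFINITE], "7BM": [DEFINITE]}
print(definiteness_score(example))  # 2
```

With four images in all, the total can only fall between -4 and +4, which matches the range stated in the text.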


No predictions were made concerning the
direction of differences that might be obtained
from the Aspiration score and the Parental
Definiteness score. It was simply hypothesized
that both scores would tap fantasy areas that
significantly influence behavior in the unique
tuberculosis treatment situation. But more
specifically, it was hypothesized that the fantasy
scores should be most significantly linked
with behaviors that the subject cannot consciously
camouflage or which he would have
little motivation to camouflage.


Table 1

Chi-Square Tests of Differences in Various Behaviors Between Those High and Those Low in TAT
Parental Definiteness and Those High and Low in TAT Achievement

                                               Parental Definiteness                High Achievement
                                          Group w/              Level of       Group w/              Level of
Behaviors                                 higher score   χ²     significance   higher score   χ²     significance

Verbal Behavior:
  Attitude toward hospital situation      -                                    -
  Family attitude toward hospitalization  -                                    -
  Goodness of parental behavior           -                                    -
  Difficulties with authority             -                                    -
  Parent occupational status              H              3.7    .06            -
  Response to peers                       H              4.3    .05-.06        H              7.9    .01

Nonverbal Behavior:
  Rated ward behavior                     -                                                   4.0    .05-.02
  AMA vs. MHB*                            MHB            5.5    .02-.01        MHB            3.8    .05
  Rate of bacteriological conversion      L              4.9    .05-.02        -

* AMA = Group leaving hospital against medical advice.
  MHB = Group remaining in hospital for maximum hospital benefit.


Results 


The results shown in Table 1 tend to cor- 
roborate the expectations underlying this 
study. The Parental Definiteness score sig- 
nificantly differentiates subjects who are 
above the median and below the median in 
four behavior areas. Individuals who have the 
most definite parental images are slower in 
their rate of bacteriological conversion, less 
likely to leave the hospital before fully 
treated, more likely to describe themselves 
as having full, satisfying relationships with 
peers, and more likely to describe their par- 
ents as being of high occupational status. 
What is most important about these signifi- 
cant differentiations is that they all involve 
behaviors which are considered to be difficult 
to camouflage or of a sort that one would 
have little motivation to “dress up.” None of 


the behaviors which have ego-involving im- 
port and which are subject to conscious 
manipulation could be predicted from the 
Parental Definiteness score.

The results shown in Table 1 concerning 
the Achievement score are in the same direction.
Of the three significant differences obtained,
none falls outside those behaviors which are
considered least likely to be dissimulated.
Those subjects with the higher Achievement 
scores are more likely to remain in the hos- 
pital for their full treatment; they have a 
greater probability of being rated by nurses 
and aides as adjusting well on the ward; and 
they are more likely to describe themselves 
as having satisfying involvement with their 
peers. 
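
For readers who wish to see how a median-split comparison of this kind yields the chi-square values and significance levels in Table 1, a minimal Python sketch follows. It assumes a 2 x 2 table with one degree of freedom and no continuity correction; the article does not say whether a correction was applied. The cell counts are hypothetical; only the row totals (26 patients who left early, 45 who stayed) come from the text.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column needs at least one case")
    return n * (a * d - b * c) ** 2 / denom


def p_value_1df(chi2: float) -> float:
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical split: rows = left early / stayed, columns = high / low TAT score.
chi2 = chi_square_2x2(8, 18, 27, 18)
print(round(chi2, 2), round(p_value_1df(chi2), 3))

# Chi-square values as printed in Table 1, converted to probabilities (1 df assumed).
for value in (3.7, 3.8, 4.0, 4.3, 4.9, 5.5):
    print(value, round(p_value_1df(value), 3))
```

Under the one-degree-of-freedom assumption, a chi-square of about 3.84 corresponds to the .05 level, which is why the printed values between 3.7 and 5.5 are reported at levels between .06 and .01.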

There is some temptation to try to account 
for the specific direction of the significant 
differences obtained. But this would be a long 
involved task and is not pertinent to the main 
objective of this paper which was to demon- 
strate that certain modes of behavior are 
more directly linked with fantasy (whether 
positively or negatively) than are other 
modes. The pattern of results suggests that 
fantasy and behavior are not two different
realms, but rather that they are intimately
connected. It would appear that the appro- 
priate question is not whether fantasy influ- 
ences behavior, but rather whether behavior 
conceptualized in certain ways can be pre- 
dicted from measures derived from specific 
defined conceptualizations of fantasy data. 


Summary 


The purpose of the study was to relate two 
measures of fantasy derived from the TAT 
to a variety of behavioral measures obtained 
from a group of persons hospitalized for treat- 
ment of tuberculosis. It was hypothesized that 
the fantasy measures would predict best those 
behaviors least subject to camouflage by the 
subjects. The pattern of results was signifi- 
cantly in the predicted direction. 


Received June 1, 1956. 


References 


1. Bach, G. R. Young children’s play fantasies.
Psychol. Monogr., 1945, 59, No. 2 (Whole No. 272).

2. Fairweather, G. W., Moran, L. J., & Morton,
R. B. Efficiency of attitudes, fantasies, and
life history data in predicting observed
behavior. J. consult. Psychol., 1956, 20, 58.

3. Fisher, S., & Cleveland, S. E. Body image
boundaries and style of life. J. abnorm. soc.
Psychol., 1956, 52, 373-379.

4. Kagan, J., & Mussen, P. H. Dependency themes
on the TAT and group conformity. J. consult.
Psychol., 1956, 20, 29-32.

5. Korner, A. F. Some aspects of hostility in young
children. New York: Grune & Stratton, 1949.

6. Moran, L. J., Fairweather, G. W., Fisher, S., &
Morton, R. B. Psychological concomitants to
rate of recovery from tuberculosis. J. consult.
Psychol., 1956, 20, 199-203.

7. Mussen, P. H., & Naylor, H. K. The relationships
between overt and fantasy aggression.
J. abnorm. soc. Psychol., 1954, 49, 235-240.

8. Sanford, R. N., Adkins, M. M., Miller, R. B.,
et al. Physique, personality and scholarship:
A cooperative study of school children.
Monogr. Soc. Res. Child Developm., 1943, 8,
No. 1.

9. Tomkins, S. S. The present status of the Thematic
Apperception Test. Amer. J. Orthopsychiat.,
1949, 19, 358-362.

























Journal of Consulting Psychology 
Vol. 21, No. 2, 1957 


Validities of Abbreviated WAIS Scales ¹


Eileen Maxwell ²


Fordham University 


Much discussion has been generated in re- 
cent years over the use and validity of short 
intelligence tests whether they be specially 
constructed scales or abridgments of those al- 
ready in use. Proponents offer numerous rea- 
sons for employing such scales, highlighting 
their utility for the use of courts, schools, in- 
stitutions, and military centers in the rapid 
estimation of intellectual level. Critics have 
attacked this practice, stressing the less satis- 
factory reliability and validity of brief tests. 

Various combinations of selected subtests 
of the Wechsler-Bellevue Intelligence Scale 
(WB) have been recommended by Cotzin 
and Gallagher (2, 3), Cummings, MacPhee, 
and Wright (4), Geil (6), Gurvitz (7), Hilden,
Taylor, and DuBois (9), Kriegman and
Hansen (11), Patterson (13, 14), Rabin 
(15), Springer (17), etc. The worth of their 
shortened scales has been questioned by Mc- 
Nemar (12) who pointed out that the sam- 
ples used in these studies were far from those 
of normal populations, being either too homo- 
geneous or too heterogeneous. 

McNemar asserted that the validity of an
abbreviated scale, as measured by the correlation
between the sum of the subtest scaled
scores and the Full Scale Score, should be
based on a sample representative of a normal
population and that, when WB subtests were
used, validity coefficients should be computed
from the group data obtained from the
standardization of the WB. Since the raw data
necessary for these correlations were not available,
McNemar devised a formula utilizing
the intercorrelations of subtests. Through
application of this formula, he ascertained the
ten best teams of two, three, four, and five
subtests of the WB.


¹ This article is based on sections of a dissertation
submitted in partial fulfillment of the requirements
for the degree of Master of Arts at Fordham
University. It is published through the kind permission
of Rev. Joseph J. Keegan, S.J., Chairman of the
Department of Psychology. The author wishes to
acknowledge her indebtedness to Dr. Dorothea
McCarthy for her constant encouragement and advice
during the preparation of the dissertation.

² Now with The Psychological Corporation.


A revision of this scale appeared in 1955 
as the Wechsler Adult Intelligence Scale 
(WAIS). Composed of six verbal and five 
performance tests, the WAIS yields Verbal, 
Performance, and Full Scale Scores, as did 
the WB. Employing data gathered in the 
WAIS standardization, Doppelt (5) was the 
first to publish a study of the effectiveness of 
an abbreviated WAIS scale in estimating Full 
Scale Score. He arbitrarily decided upon a 
four-part scale and chose the two verbal and 
the two performance subtests which corre- 
lated most highly with the Verbal and Per- 
formance Scales, respectively. The resultant 
scale consisted of the Arithmetic, Vocabulary, 
Block Design, and Picture Arrangement sub- 
tests. In all seven age groups used in the 
Doppelt study, the correlation coefficients be- 
tween the sum of the scaled scores of the four 
subtests and the Full Scale Score were .95 or 
.96. Regression equations for predicting the 
Full Scale Scores for each age group were 
presented. 


Procedure 


The present study proposed to discover 
through the use of McNemar’s formula the 
best abbreviated scales of two, three, four, 
and five WAIS subtests. The investigation is 
based on the performance of the 300 persons 
in the 25-34 year age group used in the WAIS 
standardization. The required statistics for 
this group are available in the published 
manual (19, Table 8). 




The general formula for the correlation
between the regular and abbreviated scales is:

$$r = \frac{\sum \sigma_g^2 + \sum\sum r_{gj}\,\sigma_g \sigma_j}{\sqrt{\sum \sigma_i^2 + 2\sum r_{ij}\,\sigma_i \sigma_j}\;\sqrt{\sum \sigma_g^2 + 2\sum r_{gh}\,\sigma_g \sigma_h}}$$

where the subscripts i, j identify subtests in
the regular or Full Scale and g, h identify the
subtests comprising the abbreviated scale.

McNemar estimated the error involved in
this procedure to be .005 or less. Since the σ
of each subtest is three scale units and the
$\sum r_{ij}$ is 29.27 (19, Table 8), a special formula
may be developed for each number of subtests
being used in the abbreviated scale.
Table 1 lists the possible number of abbreviated
scales using two, three, four, or five subtests
and the special formula which applies in
each instance.

The correlation coefficient, r, between the
sum of the scaled scores of k subtests and the
Full-Scale Score has been computed for each
of the 1,012 combinations, but only the ten
best scales are presented for each length.
Where several scales have equal coefficients,
however, the tables are expanded to include
all of them. Following the system of Rabin
and Guertin (16), a verbal subtest is referred
to by its first letter: I-Information,
C-Comprehension, A-Arithmetic, S-Similarities,
D-Digit Span, and V-Vocabulary; while the
performance subtests are indicated by double
initials: DS-Digit Symbol, PC-Picture
Completion, BD-Block Design, PA-Picture
Arrangement, and OA-Object Assembly.


Table 1

Number of Possible Combinations of k Subtests and
Applicable Formulae for the Correlation
of Abbreviated Scales with
Full-Scale Score

Number of       Number of possible
subtests (k)    combinations          Formula (r =)
2               55                    $(2 + \sum\sum r_{gj}) / (11.787\sqrt{1.0 + \sum r_{gh}})$
3               165                   $(3 + \sum\sum r_{gj}) / (11.787\sqrt{1.5 + \sum r_{gh}})$
4               330                   $(4 + \sum\sum r_{gj}) / (11.787\sqrt{2.0 + \sum r_{gh}})$
5               462                   $(5 + \sum\sum r_{gj}) / (11.787\sqrt{2.5 + \sum r_{gh}})$


Table 2

Correlation Coefficients Between the Eleven Best Duads
of WAIS Subtests and Full-Scale Score

Duads      r
V BD       .924
I BD       .917
I PA       .917
V PC       .914
I V        .912
V PA       .909
I PC       .906
I S        .903
I OA       .900
V OA       .898
S PA       .898
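
The computation described in the Procedure can be sketched in Python as follows. Because every subtest has the same σ, the standard deviations cancel and only the intercorrelations remain. Two points are assumptions made for this sketch: the figure 29.27 is read as the sum of the 55 intercorrelations among the eleven subtests, and the correlation matrix used in the demonstration is hypothetical (every intercorrelation set to .50), since the manual's actual matrix is not reproduced in the article.

```python
import math
from itertools import combinations
from typing import List, Sequence

N_SUBTESTS = 11     # subtests in the full WAIS
SUM_R_FULL = 29.27  # read here as the sum of the 55 intercorrelations (assumption)


def n_scales(k: int) -> int:
    """Number of distinct abbreviated scales of k subtests."""
    return math.comb(N_SUBTESTS, k)


def special_constant() -> float:
    """Constant that multiplies sqrt(k/2 + sum r_gh) when all sigmas are equal."""
    return math.sqrt(2.0 * (N_SUBTESTS + 2.0 * SUM_R_FULL))


def abbreviated_validity(r: Sequence[Sequence[float]], subset: Sequence[int]) -> float:
    """Correlation between the sum of the chosen subtests and the sum of all subtests."""
    n, k = len(r), len(subset)
    # Every correlation between a chosen subtest and any other subtest.
    sum_gj = sum(r[g][j] for g in subset for j in range(n) if j != g)
    # Each pair counted once, inside the abbreviated scale and inside the full scale.
    sum_gh = sum(r[g][h] for g, h in combinations(subset, 2))
    sum_ij = sum(r[i][j] for i, j in combinations(range(n), 2))
    return (k + sum_gj) / (math.sqrt(n + 2 * sum_ij) * math.sqrt(k + 2 * sum_gh))


# Hypothetical matrix: every pair of subtests correlates .50.
R: List[List[float]] = [[1.0 if i == j else 0.5 for j in range(N_SUBTESTS)]
                        for i in range(N_SUBTESTS)]

print([n_scales(k) for k in (2, 3, 4, 5)])   # [55, 165, 330, 462]
print(round(special_constant(), 3))          # close to the printed 11.787
print(round(abbreviated_validity(R, [0, 1]), 3))
```

Under this reading the constant comes out at about 11.79, which agrees with the printed 11.787 to within rounding, and the four counts sum to the 1,012 combinations mentioned in the text.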


Discussion of Results 


Table 2 shows the best 11 of the 55 duads 
whose range of r extends from .92 to .78. 
This may be contrasted with McNemar’s 
range of .88 to .74 for the correlations be- 
tween duad scores and Full-Scale Score for 
the WB. 

In Table 3 are presented the correlations
for the ten best triads of WAIS subtests. The
rs for the 165 combinations range from .95 to
.83, while those of McNemar extend from .91
to .83.

The twelve best tetrads are shown in
Table 4. The range in r for these 330 possible
combinations extends from .96 to .90,
while the range for the WB tetrads was reported
to be .93 to .90. Doppelt (5) records
a correlation of .954 for his scale, A V BD PA,
a value in substantial agreement with the r
of .959 as determined by this study.


Table 3

Correlation Coefficients Between the Ten Best Triads
of WAIS Subtests and Full-Scale Score

Triads        r
I V BD        .952
I V OA        .947
I S PA        .944
I S BD        .942
I S OA        .942
I V PA        .942
V PC PA       .941
V BD PA       .941
V PC BD       .940
I BD PA       .940


Table 4

Correlation Coefficients Between the Twelve Best
Tetrads of WAIS Subtests and Full-Scale Score

Tetrads          r
I V BD PA        .964
I S BD PA        .963
I S PA OA        .961
S V PC BD        .960
I V PA OA        .960
I C DS BD        .959
I S PC PA        .959
I V PC BD        .959
A V PC PA        .959
A V BD PA        .959
S V BD PA        .959
D V PC PA        .959


Table 5 contains the ten best pentads of 
WAIS subtests and their respective correla- 
tion coefficients. The range of the 462 WAIS 
pentads is .97 to .91, while the correlations 
for WB pentads extend from .94 to .89. As 
expected, the correlation between the best ab- 
breviated scale and the full WAIS increases 
as the number of subtests in the abbreviated 
scale rises from one to five. The actual in- 
crease is from .87 to .97. 
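
The selection rule behind Tables 2 through 5 (rank every scale of a given length, keep the ten best, and extend the list when coefficients tie) can be sketched as below. The four subtest names and the correlation matrix are hypothetical and serve only to make the example runnable; the function names are likewise illustrative.

```python
import math
from itertools import combinations


def validity(r, subset):
    """Correlation of an equal-sigma abbreviated scale with the full scale."""
    n, k = len(r), len(subset)
    sum_gj = sum(r[g][j] for g in subset for j in range(n) if j != g)
    sum_gh = sum(r[g][h] for g, h in combinations(subset, 2))
    sum_ij = sum(r[i][j] for i, j in combinations(range(n), 2))
    return (k + sum_gj) / math.sqrt((n + 2 * sum_ij) * (k + 2 * sum_gh))


def best_scales(r, names, k, top=10):
    """Rank all k-subtest scales by r (three decimals); keep the top `top` plus ties."""
    scored = [(round(validity(r, combo), 3), " ".join(names[i] for i in combo))
              for combo in combinations(range(len(names)), k)]
    scored.sort(key=lambda item: (-item[0], item[1]))
    if len(scored) <= top:
        return scored
    cutoff = scored[top - 1][0]
    return [item for item in scored if item[0] >= cutoff]


# Hypothetical four-subtest battery.
NAMES = ["W", "X", "Y", "Z"]
R = [
    [1.0, 0.6, 0.5, 0.4],
    [0.6, 1.0, 0.5, 0.3],
    [0.5, 0.5, 1.0, 0.2],
    [0.4, 0.3, 0.2, 1.0],
]

for r_value, scale in best_scales(R, NAMES, k=2, top=3):
    print(f"{scale:<6}{r_value:.3f}")
```

Rounding before comparing is what produces ties; it is the reason a "ten best" table can run to eleven or twelve rows.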

Table 6 presents McNemar’s best abbreviated
scales, their correlations with Full-Scale
Score as reported by him for the WB, and the
corresponding coefficients as determined by
this study for the WAIS. With the exception
of one duad, all WAIS combinations yield
larger correlations than corresponding WB
scales.


Table 5

Correlation Coefficients Between the Ten Best Pentads
of WAIS Subtests and Full-Scale Score

Pentads             r
I S V PA OA         .972
I S V BD PA         .972
A S V PA OA         .971
I S V PC BD         .971
I A S PA OA         .971
I D V PC BD         .970
I D V BD OA         .970
I V DS BD PA        .970
I V PC BD PA        .970
I S DS BD PA        .970


Table 6

Correlation Coefficients Between WB Abbreviated
Scales and Full-Scale Score as Reported
by McNemar and Corresponding
WAIS Coefficients

Scale                              r with WB total    r with WAIS total
length     Abbreviated scales      (McNemar)          (Present study)
Duads      I BD                    .884               .917
           C BD                    .881               .885
           S BD                    .880               .885
           S DS                    .864               .855
           C DS                    .853               .870
           S PC                    .851               .889
           I S                     .844               .903
           DS BD                   .844               .852
           A BD                    .841               .855
           A DS                    .840               .849
Triads     C DS BD                 .912               .927
           I S BD                  .912               .942
           S DS BD                 .911               .915
           I C BD                  .910               .939
           S DS PC                 .907               .917
           S D BD                  .906               .913
           C A BD                  .903               .923
           I DS BD                 .903               .935
           C S BD                  .902               .926
           S D PC                  .898               .917
Tetrads    C A DS BD               .932               .953
           C S DS BD               .929               .949
           I S DS BD               .928               .954
           C A DS PC               .928               .954
           A S DS PC               .928               .946
           I C DS BD               .928               .959
           S D PC BD               .927               .944
           S DS PC BD              .927               .941
           C DS PC BD              .926               .947
           A S DS BD               .926               .942
Pentads    C A DS PC BD            .944               .966
           C S DS PC BD            .942               .964
           A S DS PC BD            .942               .961
           I S DS PC BD            .942               .965
           I S D PC BD             .941               .966
           I C DS PC BD            .940               .969
           C A DS BD PA            .940               .967
           C S D PC BD             .939               .964
           C A DS PC PA            .939               .969
           C A S DS BD             .939               .966
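
The comparison summarized above can be checked mechanically. The sketch below uses the duad and triad rows of Table 6 as read from the printed page; a few digits are damaged in the scan and have been restored on the assumption that each coefficient has three decimals, and the label of the seventh duad ("I S") is inferred from Table 2, where it is the only duad with r = .903. The function names are illustrative.

```python
# (scale, r with WB total per McNemar, r with WAIS total in the present study)
DUADS = [
    ("I BD", 0.884, 0.917), ("C BD", 0.881, 0.885), ("S BD", 0.880, 0.885),
    ("S DS", 0.864, 0.855), ("C DS", 0.853, 0.870), ("S PC", 0.851, 0.889),
    ("I S", 0.844, 0.903), ("DS BD", 0.844, 0.852), ("A BD", 0.841, 0.855),
    ("A DS", 0.840, 0.849),
]
TRIADS = [
    ("C DS BD", 0.912, 0.927), ("I S BD", 0.912, 0.942), ("S DS BD", 0.911, 0.915),
    ("I C BD", 0.910, 0.939), ("S DS PC", 0.907, 0.917), ("S D BD", 0.906, 0.913),
    ("C A BD", 0.903, 0.923), ("I DS BD", 0.903, 0.935), ("C S BD", 0.902, 0.926),
    ("S D PC", 0.898, 0.917),
]


def exceptions(rows):
    """Scales whose WAIS coefficient does not exceed the WB coefficient."""
    return [scale for scale, r_wb, r_wais in rows if r_wais <= r_wb]


def mean_gain(rows):
    """Average difference, WAIS coefficient minus WB coefficient."""
    return sum(r_wais - r_wb for _, r_wb, r_wais in rows) / len(rows)


print(exceptions(DUADS))   # ['S DS']
print(exceptions(TRIADS))  # []
print(round(mean_gain(DUADS), 3), round(mean_gain(TRIADS), 3))
```

On these figures the single exception is the Similarities and Digit Symbol duad, in agreement with the statement in the text.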

Abbreviated WB scales suggested by various
investigators are contained in Table 7.
The coefficients of corresponding WAIS scales,
as determined in this study, are shown for
comparison between the WB scales reported
for clinical populations and the WAIS scales
based on a normative population; the superior
validities of the WAIS scales are again evident.


Table 7

Correlation Coefficients Between WB Abbreviated
Scales and Full-Scale Score as Reported for
Clinical Populations and Corresponding
WAIS Coefficients

                                                          r with WAIS
                                             r with WB    total (Present
Scale            Investigator                total        study)
D PA             Gurvitz (7)                 .90          .917
                 Patterson (13)              .81          .917
                 Cotzin and Gallagher (3)    .75          .917
C A              Patterson (13)              .85          .870
                 Herring (8)                 -*           .870
V BD             Hilden (9)                  .91          .929
V PC             Hilden                      .89          .914
S V              Hilden                      .88          .885
V PA             Hilden                      .88          .909
C A S            Rabin (15)                  .80          .912
                 Hunt (10)                   .78          .912
                 Springer (17)               .92          .912
C A PA           Patterson (13)              .89          .922
                 Herring (8)                 -*           .922
C V DS           Patterson (13)              .93          .914
S V BD           Hilden (9)                  .937         .938
I S V BD         Kriegman and Hansen (11)    .91          .957
C S D BD         Patterson (13)              .93          .947
V C PC BD        Patterson (13)              .96          .951
                 Herring (8)                 -*           .951
I DS PC PA       Geil (6)                    .952         .949
C S PA BD        Cotzin and Gallagher (2)    .936         .950
C A DS PC BD     Herring (8)                 -*           .966

* No coefficients were reported by Herring, who merely
recommended these scales.


Examination of the intertest correlations, 
as presented in the manuals (18, Table 41; 
19, Table 8) for the pertinent age group, re- 
veals that of the 45 correlations among the 
ten WB subtests all but seven are lower than 
those in the corresponding WAIS data. Since 
these statistics are elements used in determin- 
ing correlation coefficients, they have obvi- 
ously contributed to the larger rs of the 
WAIS scales. 

The influence of the inclusion of the Vo- 
cabulary subtest in the WAIS is also seen to 
be significant. With few exceptions the cor- 
relation of each subtest with Vocabulary is 
higher than with the remaining nine subtests. 
The validities of all abbreviated scales are
affected since $\sum r_{gj}$ and $\sum r_{gh}$, components of
the formula for correlation, include intercorrelations
with the Vocabulary test. While this
tends to result in higher correlations for all 
abbreviated scales, those containing Vocabu- 
lary reflect more directly the fact of higher 
intertest correlations. 

Data for the WAIS reveal Vocabulary as 
the most reliable subtest, having the smallest 
standard error of measurement. It is to be re- 
gretted that reliability coefficients for the WB 
subtests were not published with the other 
standardization data, since a consideration of 
changes in reliability between WAIS and WB 
subtests might have proved illuminating to 
the discussion of abbreviated scale validity. 
Probably the major reason for the higher rs 
in this study is the higher reliability of the 
WAIS subtests, and undoubtedly some of the 
changes in composition of the best abbrevi- 
ated WAIS and WB scales result from shifts 
in relative reliability of subtests. 

The best abbreviated scales of WAIS ver- 
bal subtests are presented in Table 8. The in- 
crease in r with increase in number of subtests 
is evident in these verbal scales as in the ab- 
breviated scales presented in Tables 2 through 
5. The coefficients approach the limiting value 
of .95, the figure reported by Wechsler as the 
correlation between the Verbal Scale and the 
full WAIS. As a check on procedure, the co- 
efficient for the full Verbal Scale, i.e., all six 
verbal subtests, was computed by a special 














Validities of Abbreviated WAIS Scales 


form of McNemar’s formula and found to be 
954. 

Table 9 presents the best abbreviated per- 
formance scales according to the present cri- 
terion. It may be noted that these yield 
correlations with the Full-Scale Score consid- 
erably lower than the verbal scales. The r for 
the combination of all five performance sub- 
tests was .928, as compared to .92, Wechsler’s 
value for the correlation between the Performance
Scale and the full WAIS.

Both Tables 8 and 9 have been included to
serve the particular needs of an examiner.
The selection of an abbreviated scale depends
upon the situation in which it must be
administered and the purpose for which it is
being given. After the criteria of time, clinical
intent, and required accuracy have been
established, these tables may be examined to
find the appropriate test or abbreviated scale.


Table 8

Number of Possible Combinations and Correlations
Between Full-Scale Score and Best Abbreviated
Scales of One, Two, Three, Four, Five,
and Six WAIS Verbal Subtests

                  Possible        Abbreviated
Scale length      combinations    scale          r
Single subtest    6                              .87*
                                                 .86
                                                 .79
                                                 .73
Duads             15              I V            .912
                                  I S            .903
                                                 .893
                                                 .892
                                                 .885
Triads            20                             .924
                                                 .922
                                                 .919
                                                 .918
                                                 .918
Tetrads           15                             .936
                                                 .936
                                                 .936
                                                 .936
                                                 .934
                                                 .934
Pentads           6                              .946
                                                 .946
                                                 .944
                                                 .944
                                                 .944
                                                 .943
Verbal scale      1               I C A S D V    .954

* The rs of single subtests with Full-Scale Score are taken
from the WAIS Manual (19, Table 8).


Table 9

Number of Possible Combinations and Correlations
Between Full-Scale Score and Best Abbreviated
Scales of One, Two, Three, Four, and Five
WAIS Performance Subtests

                  Possible        Abbreviated
Scale length      combinations    scale             r
Single subtest    5               PC                .78*
                                  PA                .77
                                  BD                .76
                                  DS                .71
                                  OA                .65
Duads             10              PC PA             .878
                                  DS PC             .864
                                  BD PA             .862
                                  PC BD             .853
                                  DS PA             .853
Triads            10              DS PC PA          .914
                                  DS BD PA          .906
                                  DS PC BD          .904
                                  PC BD PA          .904
                                  PC BD OA          .895
Tetrads           5               DS PC BD PA       .935
                                  DS PC PA OA       .918
                                  DS BD PA OA       .905
                                  DS PC BD OA       .902
                                  PC BD PA OA       .900
Pentads           1               DS PC BD PA OA    .928

* The rs of single subtests with Full-Scale Score are taken
from the WAIS Manual (19, Table 8).


Summary and Conclusions 


The validities of all possible abbreviated 
WAIS scales of two, three, four, and five sub- 
tests were determined in this investigation. 
The coefficient of correlation between the full 
WAIS and the sum of the particular subtest 
scores was computed by a variation of Mc- 
Nemar’s formula and this r was considered a 
measure of the validity of the abbreviated 
scale. The reference group for this study was 
the 300 men and women in the 25-34 year 
age group used in the WAIS standardization. 













On the basis of this study, the following 
conclusions were reached: 


1. The accuracy of an abbreviated scale in 
estimating Full-Scale Score increases as the 
number of subtests in the scale increases. An 
optimum point is reached, however, at which 
an increase in scale length brings about but a 
slight increase in r. Combinations composed 
exclusively of verbal tests or of performance 
tests have lower correlations than do abbrevi- 
ated scales with both types of subtests. Short 
verbal scales are superior to the performance 
scales in estimating mental level as measured 
by the whole scale. 

2. Abbreviated WAIS scales have higher 
correlations with the Full Scale than do the 
brief WB scales recommended for clinical or 
normal populations. 

3. The best abbreviated WAIS scales differ 
in composition from the best abbreviated WB 
scales. The content changes within subtests, 
their increased reliabilities and intercorrela- 
tions, and the addition of Vocabulary as a 
formal part of the WAIS scale are factors 
operant in these changes. 


Abbreviated scales reduce the time neces- 
sary for estimation of intellectual ability, but 
they also reduce the effectiveness of the in- 
dividual test. As Anastasi (1) points out, the 
use of an abbreviated scale results in the loss 
of qualitative observations which administra- 
tion of the complete scale affords. This loss, 
and the decreased accuracy in reporting the 
mental level of a subject, are important fac- 
tors to be considered before deciding upon the 
use of an abbreviated scale. 


Received January 14, 1957. 
Early Publication. 


References 


1. Anastasi, Anne. Psychological testing. New York:
    Macmillan, 1954.

2. Cotzin, M., & Gallagher, J. Validity of short
    forms of the Wechsler-Bellevue Scale for mental
    defectives. J. consult. Psychol., 1949, 13,
    357-365.

3. Cotzin, M., & Gallagher, J. The Southbury scale:
    a valid abbreviated Wechsler-Bellevue for
    mental defectives. J. consult. Psychol., 1950,
    14, 358-364.

4. Cummings, S., MacPhee, M., & Wright, H. A
    rapid method of estimating the IQs of subnormal
    white adults. J. Psychol., 1946, 21, 81-89.

5. Doppelt, J. Estimating the full scale score on the
    Wechsler Adult Intelligence Scale from scores
    on four subtests. J. consult. Psychol., 1956,
    20, 63-66.

6. Geil, G. A clinically useful abbreviated Wechsler-
    Bellevue Scale. J. Psychol., 1945, 20, 101-108.

7. Gurvitz, M. An alternate short form of the
    Wechsler-Bellevue. Amer. J. Orthopsychiat.,
    1945, 15, 727-732.

8. Herring, G. An evaluation of published short
    forms of the Wechsler-Bellevue Scale. J. consult.
    Psychol., 1952, 16, 119-123.

9. Hilden, A., Taylor, J., & DuBois, P. Empirical
    evaluation of short Wechsler-Bellevue Scales.
    J. clin. Psychol., 1952, 8, 323-331.

10. Hunt, W., Klebanoff, S., Mensh, I., & Williams,
    M. The validity of some abbreviated intelligence
    scales. J. consult. Psychol., 1948, 12, 48-52.

11. Kriegman, G., & Hansen, F. VIBS: a short form
    of the Wechsler-Bellevue Intelligence Scale.
    J. clin. Psychol., 1947, 3, 209-216.

12. McNemar, Q. On abbreviated Wechsler-Bellevue
    scales. J. consult. Psychol., 1950, 14, 79-81.

13. Patterson, G. A comparison of various “short
    forms” of the Wechsler-Bellevue Scale. J. consult.
    Psychol., 1946, 10, 260-267.

14. Patterson, G. Further study of two short forms
    of Wechsler-Bellevue. J. consult. Psychol.,
    1948, 12, 147-152.

15. Rabin, A. A short form of Wechsler-Bellevue
    test. J. appl. Psychol., 1943, 27, 320-324.

16. Rabin, A., & Guertin, W. Research with the
    Wechsler-Bellevue test: 1945-50. Psychol. Bull.,
    1951, 48, 211-248.

17. Springer, N. A short form of Wechsler-Bellevue
    intelligence test as applied to naval personnel.
    Amer. J. Orthopsychiat., 1946, 16, 341-344.

18. Wechsler, D. The measurement of adult intelligence.
    (3rd ed.) Baltimore: Williams & Wilkins, 1944.

19. Wechsler, D. Manual for the Wechsler Adult
    Intelligence Scale. New York: Psychological
    Corp., 1955.













Journal of Consulting Psychology
Vol. 21, No. 2, 1957


The Effect of Distrust on Some Aspects of Intelligence 
Test Behavior 


Gerald Wiener 1, 2


The Seton Psychiatric Institute 


This experiment is concerned with the ef- 
fect of trustful compared to distrustful atti- 
tudes upon the Picture Completion (PC) and 
the Similarities (Sim) subtests of the Wechs- 
ler Adult Intelligence Scale (WAIS) (4). The 
hypotheses tested were: 

1. Those Ss who are distrustful respond to 
the instructions of the PC and Sim subtests 
with a task-inappropriate distrustful attitude. 
They tend to think, “There is nothing miss- 
ing in that picture,” or “There is no simi- 
larity between those things.” Such inappro- 
priate responses impair performances on these 
tests and are reflected in comments made 
during the test. 

2. Instructions given to Ss designed to 
make them distrustful of the experimental 
situation have a similar impairing effect on 
PC and Sim performances because of the 
task-interfering attitudes aroused. Comments 
indicating distrust are increased by such in- 
structions. 

3. No predictions are made concerning the 
interaction of Ss’ predisposition and the ex- 
perimental instructions. Those Ss who are in- 
clined to be distrustful may be made more so 
if the situation conforms to their expecta- 
tions. Trustful Ss may be made distrustful. 
On the other hand, it is possible they will not 
perceive the experimental situation as dis- 
trustful and will not react to it as such. 


Procedure 


Four groups of 10 Ss each were used. Two
groups were chosen from Ss rated as highly
distrustful (HD groups). The remaining two
groups were rated low in distrust (LD
groups). One HD and one LD group were
given experimental instructions designed to
engender a distrustful attitude (IN groups).
One HD and one LD group were not given
distrust-producing instructions (NIN groups).


1 The author is grateful to The Seton Psychiatric
Institute and to Doctor Leo H. Bartemeier for making
this research possible.

2 Now at Rosewood State Training School, Owings
Mills, Md.


The Ss were rated for distrust according to their
responses to a 24-item inventory which was
administered to 148 female student nurses. Eleven of
the items were culled from the Minnesota Multiphasic
Personality Inventory (MMPI) scale for
paranoia (1). These items were changed slightly in
order to conform to the Ss’ background. The remaining
15 filler items were from the L scale of the
MMPI. The L score variable was controlled so that
the four groups did not differ with respect to this
variable. The items were worded so that they permitted
a five-category response ranging from entirely
trustful to entirely distrustful. Each critical item
was then scored from 1 to 5 and the scores totaled.
The highest score was taken to be indicative of the
most distrustful attitude. The scores ranged from 14
to 45. The mean score was 27.0 and the SD was
4.6. LD Ss had scores of 21 or less; HD Ss had
scores of 33 or higher. All Ss used were within the
upper or lower 15% of the distribution. The inventory
was administered independently of the experiment
and two weeks prior to it. Two of the 40 Ss
asked if a relationship existed between the experiment
and the questionnaire.


The 20 NIN Ss were given the WAIS Vocabulary
(V), PC, and Sim subtests in that
order. At the start of the experiment they
were told that the examiner was comparing
student nurses with other groups. The 20 IN
Ss were given instructions designed to induce
a distrustful attitude. They were presented
with the V subtest in the same way that the
NIN groups were. However, at the conclusion
of the V subtest, the examiner announced that
he had lied and that he was conducting the
experiment for other purposes about which
he would later inform the Ss. An impossible-to-solve
block design problem was then presented.
After 90 seconds of coping with this
problem, each S was told that she was once
again deceived and that the problem could
not be solved. The PC and Sim subtests were
then administered. All tests were presented in
accordance with the WAIS manual instructions.
Verbatim responses and spontaneous
comments were recorded.

The dependent variables consisted of the
discrepancy of the scaled PC - V and Sim - V
scores appropriate to each S’s age. Since
the scaled scores are z scores and are based
on each S’s age group, age and general level
of intelligence (as is judged by the V test)
were controlled. If a 20-year-old S had raw
scores of 55, 13, and 16 on the V, PC, and
Sim subtests, her weighted scores on each
of these subtests were 12, 9, and 11 respectively.
Her PC - V discrepancy score would
have been -3 and the Sim - V discrepancy
score would have been -1. The V scale was
chosen as a base for measuring impairment, 
because it is highly correlated with the total 
WAIS score. A priori the V subtest would not 
seem to be sensitive to a distrustful attitude. 
PC and Sim subtests were chosen as depend- 
ent variables, because they were thought to 
be more sensitive to distrustful attitudes. The 
instructions used in administering these tests 
would allow for disbelief. 
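The worked example in this paragraph can be restated as a small computation. The sketch below assumes the scaled (weighted) scores have already been looked up for the S's age group; the function name and dictionary keys are illustrative only.

```python
def discrepancy_scores(v_scaled, pc_scaled, sim_scaled):
    """Return the impairment measures relative to the Vocabulary base."""
    pc_minus_v = pc_scaled - v_scaled
    sim_minus_v = sim_scaled - v_scaled
    return {
        "PC - V": pc_minus_v,
        "Sim - V": sim_minus_v,
        # Combined measure, as in the third row of Table 1.
        "(PC - V) + (Sim - V)": pc_minus_v + sim_minus_v,
    }

# Scaled scores of 12 (V), 9 (PC), and 11 (Sim) from the example in the text.
example = discrepancy_scores(12, 9, 11)
print(example)  # {'PC - V': -3, 'Sim - V': -1, '(PC - V) + (Sim - V)': -4}
```

Negative values indicate performance below the level expected from Vocabulary, so a more negative score means greater impairment.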

The responses and spontaneous comments
were also examined for expressions of disbelief.
For the PC, these included expressions
such as “Nothing is missing from this picture,”
“Is there always something missing?,”
and “Nothing that I can see . . . ,” etc.
Expressions of disbelief on the Sim included:
“They are not alike,” “. . . Opposite . . . ,”
“Praise and punishment . . . alike did you
say?,” etc. The number of disbelief statements
for each S for the PC and Sim subtests
was tabulated.


Table 1

Mean Impairment Scores of Groups

                                     Group
                           LD      LD      HD      HD
Tests compared             NIN     IN      NIN     IN

PC - V                    -2.2    -1.5    -3.2    -3.1
Sim - V                    +.2     -.5    -1.2    -1.3
(PC - V) + (Sim - V)      -2.0    -2.0    -4.4    -4.4


Table 2

Mean Number of Comments Indicating Distrust

                          Group
                LD      LD      HD      HD
Test            NIN     IN      NIN     IN

PC               .5     1.8     1.1     2.1
Sim              .4      .4     1.2      .9
PC + Sim         .9     2.7     2.3     3.0


Results 


Data concerning impairment as a function
of Ss’ distrustful attitudes and experimental
conditions appear in Table 1. HD Ss compared
to LD Ss are significantly impaired on
the PC (F = 5.04, 1 and 36 df, p = .02).
Impairment on the Sim was also significantly
greater for HD than LD Ss (F = 4.96, 1 and
36 df, p = .02). When impairment on both
the PC and Sim is combined, it is seen that
HD Ss compared to LD Ss are more significantly
impaired (F = 9.72, 1 and 36 df,
p < .01). Instructions designed to create a
distrustful attitude apparently do not produce a
significant impairment for either HD or LD
groups or for both combined. Nor do instructions
interact significantly with Ss’ predispositions
to affect performance.

Table 2 contains data relevant to comments
indicating distrust. When the PC alone
is considered, HD Ss compared to LD Ss do
not tend to make more remarks indicative of
distrust. The IN groups make more comments
indicative of a distrustful attitude
(F = 7.57, 1 and 36 df, p < .01). The interaction
between experimental conditions and Ss’
predispositions was not significant. Considering
the Sim, the only finding attaining a significant
level was that HD Ss compared to
LD Ss produce more spontaneous distrustful
comments (F = 3.20, 1 and 36 df, p = .05).
Combining the PC and Sim comments, it was
found that HD Ss verbalize more distrustful
comments than LD Ss (F = 4.41, 1 and 36
df, p = .03). The IN Ss compared to NIN
Ss respond with more comments indicating
distrust (F = 9.53, 1 and 36 df, p < .01).
The interaction between instructions and predispositions
was not significant with the PC
and Sim combined.


3 F values using 1 df were converted into t values.
All probability estimates derived from such values
are based on one-tailed tests of significance.
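The footnote to these results states that F values with 1 df were converted into t values and evaluated one-tailed. A minimal sketch of that conversion follows, assuming the usual relation t = sqrt(F) for 1 numerator df. The upper-tail probability is obtained here by Simpson's-rule integration of the t density, which is a choice made for this illustration so that it needs only the standard library; the function names are hypothetical and the article does not say how its probabilities were read off.

```python
import math

def t_density(x, df):
    """Probability density of Student's t with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (
        math.sqrt(df * math.pi) * math.gamma(df / 2)
    )
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def one_tailed_p_from_f(f_value, df_error, steps=2000):
    """Convert F (1 and df_error df) to t; return (t, one-tailed p)."""
    t = math.sqrt(f_value)
    h = t / steps  # steps must be even for Simpson's rule
    area = t_density(0.0, df_error) + t_density(t, df_error)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_density(i * h, df_error)
    area *= h / 3  # probability mass between 0 and t
    return t, 0.5 - area

# F values reported for the impairment scores, each with 1 and 36 df.
for f in (5.04, 4.96, 9.72):
    t, p = one_tailed_p_from_f(f, 36)
    print(f"F = {f:.2f} -> t = {t:.2f}, one-tailed p = {p:.3f}")
```

Because the test is one-tailed, each probability is half of what the same F would give in the usual two-tailed reading.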

In summary, evidence was provided to 
show that comments associated with distrust 
tend to occur as a function of Ss’ predisposi- 
tions as well as experimentally induced atti- 
tudes. Intellectual impairment is more sig- 
nificantly manifested by Ss who tend to be 
distrustful. Experimental conditions are not 
associated with intellectual impairment, al- 
though they lead to comments which indicate 
distrust. 

It was hypothesized that a distrustful atti- 
tude stimulates an interfering response be- 
cause it prevents the task-appropriate re- 
sponse from being made. People who say, 
“There is nothing missing in that picture!” 
are responding to internal needs rather than 
to the testing situation. If this is the case, 
there should be a positive correlation be- 
tween interfering verbal responses and im- 
paired scores. Of 40 Ss tested, nine Ss made 
five or more remarks classified as distrustful 
on both the PC and Sim. Their combined PC 
and Sim impairment was 4.4 units. Sixteen Ss 
produced none or only one interfering com- 
ment on both the PC and Sim. These Ss were 
impaired by 2.7 units. The mean difference 
between these Ss was 1.75 (t = 1.62, 24 df,
p = .06). Thus there is some indication that 
interfering responses, as indicated by the 
emergence of verbal comments, are followed 
by an impaired intellectual performance. 


Discussion 


The results indicate that Ss’ distrustful 
predispositions are correlated with impaired 
performances on the WAIS PC and Sim sub- 
tests. Distrustful Ss will verbalize their atti- 
tude of not accepting the test instructions as 
true. Suggestive evidence was obtained to 
show that a distrustful attitude stimulates 
responses which interfere with task-appropri- 
ate behavior. In turn, intellectual impairment 
is in part a function of such task-inappropri- 
ate responses. 

Experimental instructions designed to induce
a distrustful attitude were not effective
in inducing impaired performances, but were
associated with spontaneous comments indicative
of distrust. This datum indicates that
the test instructions were in some measure
effective in inducing the desired attitude. It
is unclear why experimental instructions produce
task-interfering responses but do not
impair intellectual performance. There are
gradations in the strength of a distrustful
attitude. Verbal comments might emerge
rapidly as a result of instructions. However,
Ss may retain their ability to recover and assert
a task-appropriate attitude. Possibly the
instructions produced some distrust, but not
enough to hinder intellectual functioning.

The experiment generally indicates that 
character traits may affect WAIS perform- 
ance. It is possible that paranoid conditions 
which generally are highly associated with a 
distrustful attitude would be revealed by low- 
ered WAIS PC and Sim scores and that such 
scatter would constitute a diagnostic sign. 
Schofield’s review of the literature (3) does 
not mention such findings. Numerous person- 
ality characteristics make up a paranoid state. 
Some of these traits may tend to enhance 
rather than detract from a given intellectual 
skill. For example, the perceptual alertness of 
a paranoid may make him very sensitive to 
missing details within pictures (2, p. 80) so 
as to neutralize the impairing effect of his 
distrustful attitude. 


Summary 


It was hypothesized that distrustful atti- 
tudes are reflected in intellectual behavior as 
measured by impaired Wechsler Adult Intelli- 
gence Scale Picture Completion and Similari- 
ties subtest scores. A distrustful attitude is 
hypothesized to be a stimulus for an interfer- 
ing response which prevents task-appropriate 
responses from being made. 

Four groups of 10 Ss each were tested. Two 
groups were considered highly distrustful and 
two groups low in distrust. Distrustful atti- 
tudes were measured by a questionnaire. One 
high and one low distrustful group were given 
experimental instructions designed to induce 
a feeling of distrust toward the experimental 
situation. The remaining groups were given 
neutral instructions. 













The Ss prone to be distrustful are significantly
more impaired in their intellectual behavior.
These Ss also tend to make more
spontaneous comments indicative of a dis- 
trustful attitude. Experimental conditions de- 
signed to encourage distrust were not effective 
in impairing performance but were associated 
with spontaneous comments indicative of dis- 
trust. 


Received June 28, 1956. 


References 


1. Hathaway, S. R., & McKinley, J. C. Minnesota
Multiphasic Personality Inventory. New York:
Psychological Corp., 1951.

2. Schafer, R. The clinical application of psychological
tests. New York: International Universities
Press, 1948.

3. Schofield, W. Critique of scatter and profile analysis
of psychometric data. J. clin. Psychol.,
1952, 8, 16-22.

4. Wechsler, D. Wechsler Adult Intelligence Scale. 
New York: Psychological Corp., 1955. 











Reliability (Internal Consistency) of the Wechsler 
Memory Scale and Correlation with the 
Wechsler-Bellevue Intelligence Scale 


Julia C. Hall


Veterans Administration Hospital, Bronx, New York 


Although the Wechsler Memory Scale 
(WMS) was published in 1945 (11), no in- 
formation is available concerning either its 
reliability or its validity. The lack of these 
vital data, however, has not precluded ex- 
travagant statements about the diagnostic 
efficacy of the scale. For example, Kogan in 
a review of the test states: “Thus, [by means 
of WMS] it is possible to distinguish ac- 
curately between the kind of memory impair- 
ment which is merely one aspect of gen- 
eralized mental inefficiency and that which 
represents a specific decrement in memory 
function” (8, p. 399). In the same review 
Kogan also indicates that differences among 
WMS subtest scores will be useful in differ- 
ential diagnosis. 

In the three investigations of the scale’s
diagnostic potency which have been reported
(1, 4, 5), the test failed to distinguish groups
having brain damage from control groups.
Cohen’s study was unusually adequate, because
it was free of the sampling defect which
has seriously reduced the value of many studies
of the diagnostic usefulness of psychological
tests. That is, his experimental group was
not composed entirely of grossly brain-damaged
individuals but was a representative
sample and, therefore, heterogeneous with regard
to the degree of brain damage. Cohen
concluded: “. . . Wechsler Memory Scale
subtest scores and their combinations do not
differentiate among psychoneurotic, organic,
and schizophrenic groups” (1, p. 375).


1 From the Clinical Psychology Section, Neuropsychiatric
Service, Bronx VA Hospital. The writer
wishes to express her appreciation to Drs. H. L.
Flowers, Chief, Neuropsychiatric Service, and R. S.
Morrow, Chief, Clinical Psychology Section, for
their encouragement and sustained support.


The demonstration that the scale is lacking 
in diagnostic efficacy, however, does not nec- 
essarily indicate that it is lacking in reli- 
ability nor does it negate its usefulness for 
purposes other than diagnosis. Diagnosis is 
hardly the sole raison d’être for psychological
tests, and it is possible that memory tests
could make valuable contributions to two 
broad areas of activity: guidance and re- 
search. For example, memory tests might be 
of assistance in helping the individual who 
has sustained brain damage reach decisions
about his future mode of life. They also have 
potential value for investigation of brain func- 
tion (e.g., see the investigation of Morrow 
and Mark [9] of the effects of gross brain 
damage on memory test performance). The 
need for the guidance and research activities 
described is acute. The crucial question inso- 
far as the WMS is concerned is whether or 
not the scale can yield meaningful data. The 
study reported here is an investigation of 
two aspects of WMS: its reliability and its 
correlation with the Wechsler-Bellevue Intelli- 
gence Scale (W-B). The latter part of the 
study has an indirect bearing on the ques- 
tion of validity, since it asks the question: to 
what degree is the WMS measuring some- 
thing that is not measured in the intelligence 
scale? 


Procedure 


The WMS and the W-B, Forms I and II,
are described in detail in their respective
manuals (10, 11, 12). Although there are two
forms of WMS, only Form I was used in this
study.


Table 1

Range, Mean, and Standard Deviation of Age, Education,
Wechsler-Bellevue IQ, and Wechsler
Memory Scale Scores of 150 Males

Variable                           Range      Mean      SD

Age                                25-34      29.11     2.77
Education                           6-18      10.81     2.56

Wechsler-Bellevue
  Full Scale IQ                   78-136     107.87    12.18
  Full Scale Score                57-148     105.49    17.76
  Verbal Scale Score               26-79      52.41    10.90
  Performance Scale Score          30-73      53.08     9.45

Wechsler Memory Scale
  Information and Orientation      10-11      10.75      .62
  Mental Control                     2-9       6.92     2.01
  Logical Memory                    2-20      10.01     3.85
  Digit Span                        7-15      11.65     2.07
  Visual Reproduction               1-14       9.74     2.99
  Associate Learning                7-21      15.93     3.27
  Total of Subtest Scores*         37-87      65.01    10.13
  Memory Quotient                 63-143     104.95    16.98

* The total before addition of the age correction points.

The subjects of the study were individuals 
tested in the Clinical Psychology Section, 
Neuropsychiatric Service, Bronx VA Hospital, 
during the period from 1946 through 1949. 
During this period the WMS and the W-B 
were given routinely by some members of the 
staff and by some of the clinical psychology 
trainees. The diagnoses of the subjects are 
varied (e.g., diabetes, phantom limb pain, 
hepatitis, etc.), but the majority, 55%, had 
behavior disorder diagnoses (neurosis, imma- 
ture personality, etc.). 

The medical records of all patients who re- 
ceived both the WMS and the W-B and were 
in the age group from 25 through 34 were ex- 
amined, and all individuals who fulfilled the 
research criteria were selected. The criteria 
were: (a) not psychotic, (b) no symptoms 
attributed to brain damage, (c) no history 
of head trauma, (d) electroencephalogram, if 
done, reported as normal, (e) no history of 
shock therapy, (f) if Negro, not educated in 
a southern state, and (g) not educated in a 
foreign country. A total of 150 individuals
(all males) passed the research criteria. Of
the total group, 86 were given W-B, Form I,
and 64 were given W-B, Form II. In the correlations
with WMS, the scores from W-B I
and II are pooled.


Results 


Table 1 shows the range, mean, and stand- 
ard deviation of the age and education of the 
group as well as their WMS and W-B scores. 
Chi-square tests of the shape of the Full
Scale IQ and the Memory Quotient (MQ)
distributions indicate that both approximate
a normal curve (χ² of 2.3170 with 7 df, .95
> p > .90; χ² of 7.9987 with 7 df, .50 > p
> .30).

The internal consistency of five of the 
WMS subtests and of the total scale was 
tested by the method (coefficient alpha) de- 
scribed by Cronbach (2); the reliability co- 
efficients are shown in Table 2. Since the 
variability of the Information and Orienta- 
tion subtests is infinitesimal, analysis of these 
two subtests was not done. 
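As a rough illustration of the method, coefficient alpha can be computed from the variance of each part and the variance of their sum: alpha = k / (k - 1) * (1 - sum of part variances / total variance), where k is the number of parts. The sketch below applies this to variances reported in Table 2; the function name and data layout are illustrative and not from the article.

```python
def coefficient_alpha(part_variances, total_variance):
    """Cronbach's coefficient alpha from part variances and total variance."""
    k = len(part_variances)
    if k < 2:
        raise ValueError("alpha requires at least two parts")
    return k / (k - 1) * (1 - sum(part_variances) / total_variance)

# (item variances, subtest variance) as reported in Table 2.
table2 = {
    "Logical Memory": ([4.973, 3.812], 14.816),
    "Digit Span": ([1.122, 1.773], 4.280),
    "Associate Learning": ([0.453, 8.274], 10.692),
}
alphas = {
    name: round(coefficient_alpha(parts, total), 3)
    for name, (parts, total) in table2.items()
}
print(alphas)
```

For these three subtests the computation gives .814, .647, and .368, the coefficients shown in Table 2.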

Table 2

Reliability of the Wechsler Memory Scale

                                              Coefficient
Subtest and subtest items        Variance     alpha

Information and Orientation         .389
Mental Control                     4.020       .383
  Counting 20-1                     .487
  Alphabet                         1.154
  Counting by 3’s                  1.353
Logical Memory                    14.816       .814
  Paragraph 1/2                    4.973
  Paragraph 2/2                    3.812
Digit Span                         4.280       .647
  Digits Forward                   1.122
  Digits Backward                  1.773
Visual Reproduction                8.925       .634
  Design A                          .552
  Design B                         1.950
  Design C-1                       1.253
  Design C-2                        .921
Associate Learning                10.692       .368
  Easy Words/2                      .453
  Hard Words                       8.274
Total Score                      102.698       .696


In order to secure information about the
relationship between specific subtests, the
intercorrelations of five of the WMS subtests
with each other and with total score were
computed. Since the reliability coefficient
(.368) of the Associate Learning Subtest indicates
that the two items of the subtest are
not at all equivalent and since logical analysis
suggests that it is the Hard Words portion
in which learning activity is most involved,
the Hard Words score was correlated
with the four other subtests and with the total
score.

To secure information about the degree of
relationship between the WMS and the W-B,
the total WMS score (before the addition of
the age correction score) was correlated with
the W-B Verbal, Performance, and Full Scale
scores. Digit Span is a subtest in both the
W-B and WMS, and the score of this subtest
was, therefore, eliminated from the W-B
scores. Since in all but nine of the 150 cases
all six of the W-B Verbal subtests were given,
the elimination of Digit Span had little effect
upon the prorated Verbal scores (with Digit
Span excluded, mean and standard deviation
of Verbal scores are 53.69 and 10.97, respectively).
All of the correlation coefficients are
shown in Table 3.


Table 3

Intercorrelations of Wechsler Memory Scale Subtests and of the Memory Scale with the
Wechsler-Bellevue Intelligence Scale

                                        Correlation coefficient
                          Logical            Visual    Assoc.    Hard     Total
Variable                  memory    Digits   repro.    learn.    words    score*

Wechsler Memory
  Mental Control           .369      .474     .325      .268      .236     .445
  Logical Memory                     .396     .359      .776      .405     .524
  Digits                                      .394      .342      .321     .527
  Visual Reproduction                                   .298      .284     .458
  Associate Learning                                                       .473
  Hard Words                                                               .442
Wechsler-Bellevue
  Full Scale Score†                                                        .750
  Verbal Score†                                                            .690
  Performance Score                                                        .536

* With the relevant subtest eliminated.
† With the Digit Span score excluded.


Wechsler intended the MQ to be “directly
comparable” to W-B Full Scale IQ (11, p.
90), and, in clinical practice, an MQ lower
than the Full Scale IQ is frequently cited as
evidence of brain damage. This practice exists,
however, without any knowledge about
either the correlation between IQ and MQ or
the size of difference that can be considered
significant. The correlation coefficient of IQ
and MQ in this study is .767. The standard
deviation of difference scores (IQ minus MQ)
is 10.687; the range is from +40 to -21.

In an effort to determine whether the mean 
difference between IQ and MQ varies with IQ 
level, an analysis of variance was done. The 
obtained F ratio, however, is not significant 
(F of 1.04 with 3 and 146 df). 


Discussion 
Use of WMS in Diagnosis 


Although the diagnostic usefulness of the 
WMS was not the focus of this study, two of 
the findings are pertinent to the validity of 
some of the assumptions underlying present 
diagnostic practice: 


1. The low intercorrelations among WMS
subtests indicate that score differences are so
common and so large that they have no practical
predictive value. In the voluminous literature
on learning tasks (and most of the
WMS subtests are simple learning tasks) low
intercorrelations are consistently reported (3,
6). The findings in the present study, therefore,
cannot be explained as nothing more
than the natural consequence of a preponderantly
neurotic sample. The assumption that,
in the absence of demonstrable brain pathology,
individuals will perform at approximately
the same level on all the WMS subtests
is not tenable.

2. In 24% of the cases in the study re- 
ported here, IQ exceeded MQ by 24 and more 
points. Such large deviations occurring in a 
group so carefully screened for brain damage 
indicate that the degree of comparability be- 
tween the two quotients is not great enough 
for use in individual diagnosis. 


Reliability 


The reliability coefficients indicate that the 
subtests and, in several instances, the subtest 
items have a low degree of measurement 
equivalence. Two of the subtests, Mental Con- 
trol and Associate Learning, have so little in- 
ternal consistency that it is difficult to justify 
the segregation of their items into subtests. 
Observation of patients suggests that most of 
the variance which occurs in Mental Control 
is error variance—variance contributed by 
such a startle reaction as: “Say my ABC’s! 
The doctor must really think I am crazy!” 

The degree of internal consistency that is 
optimum for a test is, of course, an empirical 
question and cannot be determined without 
specification of what the test is intended to 
predict. Unfortunately, memory tests have 
been constructed in the absence of any ap- 
preciable effort to describe and analyze the 
aspects of human behavior which the term 
memory function designates. (For further dis- 
cussion of the definition of memory, see Ing- 
ham [7].) Since the WMS is seemingly based 
on an ambiguous “common-sense” definition 
of memory, it would be difficult to ascertain 
just what degree of reliability would be opti- 
mum. When one takes into account, however, 
the relatively low reliability and extreme brev- 
ity in conjunction with the range of function 
level which the test attempts to encompass, 
it seems highly unlikely that the WMS is ca- 
pable of yielding data which are sufficiently 
accurate to be useful in predicting behavior. 

The level of subtest difficulty is so dis- 
parate that it is questionable whether the en- 
tire test should be given to many individuals. 




Observation of the reaction of patients sug- 
gests that several of the subtests (viz., Infor- 
mation, Orientation, Mental Control, and the 
Easy Words item of Associate Learning) are 
so simple that they are appropriate only in 
those instances in which there is gross disor- 
ganization of intellectual processes. Individu- 
als who, in a hospital setting, are asked to 
complete items which “any child would know” 
are apt to be either frightened to death or in- 
sulted by the implications of the request. 
Neither reaction is conducive to securing data 
which are representative of the individual’s 
behavior under more usual circumstances. 
The analysis of the WMS and of the types of 
problems with which clinical psychology is 
concerned suggests that improvement in the 
predictive accuracy of psychological tests ne- 
cessitates the abandonment of tests which are 
measures-of-everything and the construction 
of a variety of univocal tests, each test de- 
signed to secure samples of very limited and 
carefully specified aspects of human function. 


Correlation of WMS with W-B 


Wechsler says that, in the standardization 
group of “about 100” with an age range from 
25 to 50, WMS age group means “paralleled 
very closely that of the Performance part of 
the Bellevue” (11, p. 88). From the context 
in which the statement occurs, one infers that 
the memory score showed an age decline curve 
similar to that of W-B Performance. In the 
relatively young age group studied here, how- 
ever, the relationship between WMS and W-B 
Verbal is greater than that with W-B Per- 
formance (see Table 3). It is, of course, pos- 
sible that in older age groups the relationship 
is reversed. 

The overlap in measurement between WMS 
and W-B is fairly large (r of Memory score 
and W-B Full Scale score is .750). In view of 
the scale’s brevity, relatively low internal con- 
sistency, and many sources of error variance, 
one must ask: Once the communality with 
W-B is removed, is the remaining variance 
of WMS anything other than error variance? 
Unless it can be demonstrated empirically 
that WMS contributes unique and true vari- 
ance to studies of human behavior, one can- 
not justify giving both tests. 
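
The question raised above can be made concrete with a rough variance partition. The sketch below is illustrative only: it assumes that coefficient alpha (.696) estimates the proportion of reliable variance in the WMS score, that the squared correlation with W-B Full Scale (.750) estimates the proportion shared with W-B, and it takes no account of the unreliability of W-B itself. The variable names are hypothetical.

```python
# Figures reported in the text; the decomposition itself is an assumption.
alpha_wms = 0.696   # internal consistency (alpha) of the WMS total score
r_wms_wb = 0.750    # correlation of Memory score with W-B Full Scale score

shared = r_wms_wb ** 2                 # proportion of WMS variance shared with W-B
error = 1.0 - alpha_wms                # proportion treated as error variance
unique_reliable = alpha_wms - shared   # reliable variance not shared with W-B

print(f"shared with W-B:        {shared:.4f}")
print(f"error variance:         {error:.4f}")
print(f"unique and reliable:    {unique_reliable:.4f}")
```

Under these assumptions the portion of WMS variance that is both reliable and independent of W-B is small relative to the error portion, which is the point of the question posed in the text.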








Summary and Conclusions 


The WMS reliability, subtest intercorrela- 
tions, and correlation with W-B were investi- 
gated. The following findings are reported. 


1. The reliability coefficients (alpha coeffi- 
cient of internal consistency) of WMS sub- 
tests range from .368 to .817. That of the 
scale is .696. 

2. The WMS subtest intercorrelations range 
from .268 to .776; seven of the ten coeffi- 
cients (rs) are between .30 and .40. The low 
intercorrelations indicate that the assumption 
that, in the absence of brain pathology, indi- 
viduals tend to do approximately the same on 
the various subtests is untenable. 

3. The correlation coefficients of the WMS
score with W-B Verbal, Performance, and
Full Scale scores are .690, .536, and .750,
respectively; that of MQ and IQ, .767. The
degree of correlation with W-B Full Scale, in
conjunction with the low reliability and
brevity of WMS, raises a question as to the
justification for giving both tests.

4. The standard deviation of the differences 
between IQ and MQ is 10.687. 


Received June 29, 1956. 


References


1. Cohen, J. Wechsler Memory Scale performance
    of psychoneurotic, organic, and schizophrenic
    groups. J. consult. Psychol., 1950, 14, 371-375.

2. Cronbach, L. J. Coefficient alpha and the internal
    structure of tests. Psychometrika, 1951,
    16, 297-334.

3. Hall, C. S. Intercorrelations of measures of human
    learning. Psychol. Rev., 1936, 43, 179-196.

4. Howard, A. R. Diagnostic value of the Wechsler
    Memory Scale with selected groups of
    institutionalized patients. J. consult. Psychol., 1950,
    14, 376-380.

5. Howard, A. R. Further validation studies of the
    WMS. J. clin. Psychol., 1953, 10, 164-167.

6. Husband, R. W. Intercorrelations among learning
    abilities. J. genet. Psychol., 1939, 55, 353-364.

7. Ingham, J. G. Memory and intelligence. Brit. J.
    Psychol., 1952, 43, 20-31.

8. Kogan, Kate L. Wechsler Memory Scale. In
    O. K. Buros (Ed.), The third mental measurements
    yearbook. New Brunswick: Rutgers
    Univer. Press, 1949. Pp. 398-399.

9. Morrow, R. S., & Mark, J. C. The performance
    of brain-damaged patients on the Wechsler
    Memory Scale. Amer. Psychologist, 1955, 10,
    324. (Abstract)

10. Wechsler, D. The measurement of adult intelligence.
    Baltimore: Williams & Wilkins, 1939.

11. Wechsler, D. A standardized memory scale for
    clinical use. J. Psychol., 1945, 19, 87-95.

12. Wechsler, D. The Wechsler-Bellevue Intelligence
    Scale, Form II. New York: Psychological
    Corp., 1946.








Journal of Consulting Psychology 
Vol. 21, No. 2, 1957 


Subtest Disparity of Negro and White Groups
Matched for IQs on the Revised Beta Test¹


Walter A. Woods 
Nowland & Company 


and Robert Toal * 
Medical College of Virginia 


The report of a study by Woods, Boger, 
and Holman (3) revealed that subtest scaled 
scores of the Revised Beta Examination con- 
tributed unequally to total test scores of 
Negro adolescents. Nondelinquents attained 
higher scores than did delinquents, but in 
both groups it was apparent that some sub- 
tests contributed greater weight than others. 
No significant differences were found between 
sexes. 

Two alternatives appeared to provide pos- 
sible explanations for the subtest disparity. 
Since the sample was substantially below the 
norm group mean, although above the defec- 
tive level, it could be that the subtests con- 
tributed disproportionately among groups at 
the lower IQ levels. Yet a second possibility 
was that some influence was operative in Ne- 
gro groups, and absent in white (norm) 
groups, which brought forth differential re- 
sponses to the subtest items. 

The present study attempts to throw light 
upon these two possible alternatives. 


Experimental Design 


One hundred and twenty Beta scores of 
adolescent male and female Negro subjects, 
drawn from industrial school and public 
school populations, were matched with 120 
Beta scores of white adolescents, also drawn 
from industrial school and public school popu- 
lations. Matching was made on sex and on 
total score. Since the previous study had
demonstrated that sex differences were not
significant in the populations at issue, male
and female subjects were used. All subjects
were between the ages of 14 and 17, inclusive.
No attempt was made to match for age,
since for the purposes of our experiment age
matching was unnecessary.


1 From research conducted while authors were
affiliated with Richmond Professional Institute.


Scores were classified according to six lev- 
els of performance, from high to low, for Ne- 
gro and white groups to facilitate a treat- 
ments-by-levels analysis of variance. Twenty 
Negro and 20 white scores were assigned to 
each level. The subtest scores were regarded 
as treatments: an assumption of subtest 
equality is intrinsic in the scaling, thus meet- 
ing the assumption of equality of subtest 
means in the population. 

Negro and white performance on each sub- 
test was compared by computing differences 
between group means on each subtest and 
ascertaining t ratios.


Results 


Table 1 summarizes the mean scores for 
the subtests, by levels, for Negro and white 
groups. Total means are equal for Negroes 
and whites at each level, since the groups 
were matched on total score. The subtest 
means at each level support the findings of 
the former study, indicating disproportionate 
contributions of the subtests at different lev- 
els. When we compare the column means by 
t ratio, we find that the subtest contributions 
are not the same for Negroes as for whites. 
Whites perform better on Subtests 3 (detection
of errors), 4 (paper form board), and 5
(drawing completion). All of these differences
exceed a 95 in 100 chance expectancy. Negroes
perform better than whites on Subtests
2 (digit symbol), and 6 (visual comparison).
These differences also exceed those expected
by chance at the 95% level.

From analysis of variance it is found that
first-order interaction effects of subtests with
levels, and of subtests with race, are present to
a significant degree. This is shown in Table 2.
Treatments effects and levels effects are also
significant. Interaction of race and levels is
nonsignificant. The levels effect is, of course,
intrinsic in the design. The significant subtests
effect supports the hypothesis that the
subtest contribution is unequal.


Table 1

Mean Scaled Scores of Negro and White Adolescent Groups Classified by Six Levels on
Total Scaled Scores on the Revised Beta Examination

                 Total                  Subtest Mean
Level  Race      Mean      1       2       3       4       5       6
1      Negro      7.1     5.1     8.4     8.4     4.5     7.2     9.0
       White      7.1     5.8     7.9     8.0     4.8     7.7     8.1
2      Negro      8.4     7.7    10.7     8.6     4.9     8.2    10.2
       White      8.4     6.9     9.6    10.0     6.3     8.9     8.8
3      Negro      9.2     8.7    11.2     9.6     5.6     9.6    10.6
       White      9.2     8.2    10.4     9.5     7.4     9.8    10.1
4      Negro      9.8     8.7    13.0    10.0     6.2     9.8    11.4
       White      9.8     8.9    10.5    10.6     7.9    10.6    10.4
5      Negro     10.4     9.3    12.4    10.9     7.9    10.7    11.4
       White     10.4     9.4    11.5    11.6     8.8    11.2    10.2
6      Negro     11.5    10.9    13.9    11.2     9.2    11.3    12.7
       White     11.5    10.7    11.7    12.0    10.9    12.3    11.6
Mean of means
       Negro      9.4     8.4    11.6     9.8     6.4     9.4    10.9
       White      9.4     8.3    10.2    10.3     7.7    10.1     9.9
Difference                 .1     1.4      .5     1.3      .7     1.0
t                        .245    4.58*   1.96*   3.92*   2.15*   3.71*

* Significant at .05 level.


An examination of the mean scores by
level for each subtest in the white groups
indicates that, although the contributions are
unequal at each level, they increase
proportionately with increase in total score and tend
to become equal at the higher levels. On the
other hand, the increase in Negro scores is
not consistent for the several subtests, since
some scores increase at a greater rate than
others. A significant second-order interaction
effect of levels with race and with subtest
provides some explanation for this different
pattern among the white as contrasted to the
Negro group. Race with levels interaction is
not significant, while race with subtest is.
Thus, the second-order interaction is explainable
only in terms of the disproportionate
increase in the Negro group. This disproportionality
appears to be produced by several
of the subtests, more markedly by Subtests
3 and 6, the means of which increase with
smaller increment than do the means of the
other subtests; and Subtests 1 and 4, which
increase with greater increment.


Table 2

Analysis of Variance of 120 White and 120 Negro
Revised Beta Scores Classified by Subtest,
by Level, and by Race

                                        Mean
Source                         df      Square        F
Subtests                        5      501.34     199.36*
Levels                          5      581.83     304.19*
Race                            1         .00        .00
Subtests by Levels             25       46.56      17.8*
Subtests by Race                5       61.09      23.4*
Levels by Race                  5         .00        .00
Levels by Subtests by Race     25        9.58       3.67*
Individual Differences       1372        2.61

* Significant at .05 level.

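
As an illustration of how the interaction F values in Table 2 arise, the sketch below divides each interaction mean square by the Individual Differences mean square. That this is the error term used for every F is an assumption, and the names `ERROR_MS`, `interaction_ms`, and `f_ratio` are hypothetical.

```python
# Mean squares as printed in Table 2.
ERROR_MS = 2.61  # "Individual Differences" mean square, assumed to be the error term

interaction_ms = {
    "Subtests by Levels": 46.56,
    "Subtests by Race": 61.09,
    "Levels by Subtests by Race": 9.58,
}


def f_ratio(effect_ms, error_ms=ERROR_MS):
    """Return the F ratio of an effect mean square to the error mean square."""
    return effect_ms / error_ms


for source, ms in interaction_ms.items():
    print(f"{source}: F = {f_ratio(ms):.2f}")
```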

These findings indicate that scaled scores 
on the Beta Examination are disproportion- 
ate due to level—some of the subtests con- 
tributing more, others less, at lower levels of 
ability—and due to racial (and this may be 
cultural) differences. This latter hypothesis 
is supported both by the disproportionality 
at different levels, particularly at lower lev- 
els, and by the second-order interaction. 

The tests on which Negroes tend to per- 
form better are essentially tests which re- 
quire perceptual speed and accuracy. It has 
been reported in the literature on racial in- 
telligence differences (1, p. 491) that Ne- 
groes in our society seem to have little in- 
centive to do things rapidly. In our groups 
matched on total score, our Negro sample
was not handicapped by this supposed dis- 
inclination to work rapidly. 

Two of the subtests on which Negroes are 
inferior to whites (3 and 5) are culturally 
“loaded”; that is, they contain items which 
appear to be common in our culture, but are 
not equally common to all cultural segments. 
The third, Subtest 4 (paper form board), 
seems related to judgments involving spatial 


visualization. In conceptualizing the visual 
material presented in this subtest, and men- 
tally manipulating it, a higher level cognitive 
process seems to be required, which is not 
needed for the successful performance on any 
of the other subtests. Thus, it appears that 
Negroes, when compared with whites of 
“equal ability,” are most deficient in cul- 
turally loaded items and in items which re- 
quire ability to visualize spatially. They seem 
superior to whites in items requiring per- 
ceptual speed and accuracy. 

We also find greater subtest disparity at 
lower levels for both Negroes and whites. 
Differences exist at lower levels which are 
not so much in evidence at higher levels. 
This was particularly apparent in the maze 
and paper form board tests, which are be- 
lieved to involve, respectively, ability to plan 
ahead and ability to mentally manipulate 
and to visualize spatial arrangements. 

It is suggested that other studies in which 
Negroes and whites are matched on particu- 
lar variables may prove revealing. 


Received June 29, 1956. 


References 


1. Davidson, K. S., Gibby, R. G., McNeil, E. B., 
Segal, S. J., & Silverman, H. A preliminary 
study of Negro and white differences on Form 
I of the Wechsler Bellevue Scale. J. consult. 
Psychol., 1950, 14, 489-492. 

2. Kellogg, C. E., Morton, N. W., Lindner, R. M., 
& Gurvitz, M. Manual for the Revised Beta 
Examination. New York: Psychological Corp., 
1946. 

3. Woods, W. A., Boger, J., & Holman, G. An
investigation of Revised Beta scores (Lindner-Gurvitz
Scale) among Negro adolescents. Va.
J. Sci., 1954, 5, 321.










Journal of Consulting Psychology 
Vol. 21, No. 2, 1957 


Revised Administration and Scoring of the
Digit Span Test¹







Harold L. Blackburn and Arthur L. Benton 


State University of Iowa 


It is well known that, under standard con- 
ditions of administration (e.g., in the WAIS), 
the auditory-vocal digit span test has a rela- 
tively low reliability. For example, Derner, 
Aborn, and Canter (1), studying the test- 
retest reliability of the Wechsler-Bellevue sub- 
tests in a sample of 158 normal Ss, found the 
correlation coefficient between test and retest 
to be .67, this value ranking ninth among the 
11 subtests of the scale. Moreover, in sum- 
marizing the findings of seven studies dealing 
with the performances of neuropsychiatric pa- 
tients, these authors report a median test-re- 
test correlation coefficient of .65 for the digit 
span, an estimate which is hardly suggestive 
of satisfactory reliability. Yet the test plays 
an important role in clinical diagnosis, par- 
ticularly as an “anxiety indicator” and as a 
basis for inferences about the presence of 
cerebral injury or disease. 

It seems reasonable to suppose that the va- 
lidity of the digit span for these and other 
purposes would be enhanced if a procedure 
that yielded a more reliable estimate of the 
abilities involved in this performance were 
utilized. Augmentation of the test-retest re- 
liability of a task such as this can be rather 
easily accomplished by increasing the number 
of trials, adopting a systematic order of pres- 
entation and determining a relatively stable 
threshold value. However, while this psychophysical
procedure is appropriate for experimental
work, it is much too time-consuming
for routine clinical use. On the other hand, it
is possible that some relatively slight modifications
in procedure and scoring, which would
not necessarily involve an important increase
in administration time, might provide estimates
that are significantly more reliable than
those secured with the standard administration
and scoring.


1 This investigation was supported by a research
grant (B-616) from the National Institute of
Neurological Diseases and Blindness, of the National
Institutes of Health, Public Health Service.

The writers are greatly indebted to Dr. Leonard S.
Feldt for valuable suggestions and criticisms and to
Mr. Richard C. Jentsch for assistance in the
collection of the data.

This possibility was explored in the present 
study which investigated the effects of two 
minor procedural modifications and a revision 
in scoring on the test-retest reliability of the 
task. The procedural modifications consisted 
of: (a) having S repeat or reverse both sets 
of digits of a given series length even when he 
had correctly repeated or reversed the first 
set of the pair; and (b) terminating the repetition
or reversal of digits after three successive
failures rather than two. The revision in
scoring consisted in giving credit for each set
of digits correctly repeated or reversed rather
than by the conventional “highest score”
method.²


Procedure 


The digit span performances of three main 
groups of Ss were investigated: (a) 100 college
students; (b) 105 nonpsychotic patients
on various services of the University Hospital 
and Veterans Administration Hospital, Iowa 
City; (c) 61 patients with confirmed or sus- 
pected cerebral disease who were on the neu- 
rological or neurosurgical services of these 
hospitals. 


2 Justification for this revision in scoring is pro- 
vided in the intensive study of digit span perform- 
ance by Peatman and Locke (3), who found that 
it yielded consistently higher test-retest reliabilities 
than did the “highest score” method. 




Approximately half the Ss in each group re- 
ceived the WAIS administration of the digit 
span. Following interpolated tasks of 20-30 
minutes duration, they were retested with the 
same administration. The other half of each 
group received the revised administration 
(hereafter referred to as the BB administra- 
tion) of the digit span under the same con- 
ditions, i.e., retest after an interval of 20-30 
minutes. Each type of administration could 
be further subdivided into two types, accord- 
ing to whether two or three successive fail- 
ures were taken as the criterion for terminat- 
ing the task of repeating or reversing the 
digits. Thus, four administrative procedures 
were available for comparison with respect to 
the test-retest reliability of the performance 
estimates, viz.: 

WAIS II: Standard WAIS administration.

WAIS III: Standard WAIS administration
except that the task is terminated after three
successive failures.

BB II: Requiring S to repeat or reverse
both sets of digits of a given series length and
termination of the task after two successive
failures.

BB III: BB II procedure, except that the
task is terminated after three successive
failures.

Scoring. Performances under the WAIS
administrations were scored in the usual manner,
i.e., the “highest score” method. Performances
under the BB administrations were
scored in terms of the number of sets of digits
correctly repeated or reversed, one point being
given for each correctly reproduced set,
always beginning with three digits forward
and two digits backward. No basal score was
added to this simple frequency count.
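
The two scoring rules can be contrasted in a short sketch. It is illustrative only: the representation of a protocol as a list of `(series_length, passed)` pairs in order of presentation, the function names, and the sample protocol are all hypothetical, and only one direction (forward or backward) is scored at a time.

```python
def bb_score(trials, stop_after=3):
    """Count correctly reproduced sets until `stop_after` successive failures."""
    score = 0
    failures_in_a_row = 0
    for _length, passed in trials:
        if passed:
            score += 1
            failures_in_a_row = 0
        else:
            failures_in_a_row += 1
            if failures_in_a_row == stop_after:
                break
    return score


def highest_score(trials, stop_after=2):
    """Return the longest series length reproduced correctly before termination."""
    best = 0
    failures_in_a_row = 0
    for length, passed in trials:
        if passed:
            best = max(best, length)
            failures_in_a_row = 0
        else:
            failures_in_a_row += 1
            if failures_in_a_row == stop_after:
                break
    return best


# Hypothetical digits-forward protocol, two sets at each series length.
forward = [(3, True), (3, True), (4, True), (4, False),
           (5, False), (5, True), (6, False), (6, False), (7, False)]

print(bb_score(forward, stop_after=2))   # frequency count, two-failure criterion
print(bb_score(forward, stop_after=3))   # frequency count, three-failure criterion
print(highest_score(forward))            # "highest score" method
```

In this made-up protocol the three-failure criterion lets S recover after two misses, so the frequency count credits a set that the two-failure criterion never reaches.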


Results 


The mean ages and educational levels of 
the Ss in the several subgroups are presented 
in Table 1. It is evident from inspection that, 
within each group, the two subgroups who re- 
ceived the different administrations are fairly 
comparable with respect to these characteristics.
The three largest differences (ages in
the nonpsychiatric patient and brain-injured
patient groups; educational levels in the
brain-injured group) were analyzed by means
of the t test and found to be nonsignificant.


Table 1

Age, Educational Background and Level of Test Performance in the Several Groups

                                                               Initial Test
                                      Age     Education
Group                  N            (years)    (years)     II Score     III Score
College WAIS           50   Mean     20.9       13.2         11.7         12.0
                            SD        3.7        1.1          2.1          2.1
College BB             50   Mean     19.8       13.1         15.7         16.1
                            SD        1.8        1.0          3.7          3.7
Patients WAIS          56   Mean     39.9        9.6         10.2         10.4
                            SD       11.4        2.2      [illegible]      2.2
Patients BB            49   Mean     36.6        9.8         13.1         13.3
                            SD       10.9        2.5          3.2          3.2
Brain-Injured WAIS     30   Mean     41.6       10.5          9.1          9.5
                            SD       13.0        3.9          2.1          2.2
Brain-Injured BB       31   Mean     37.5        9.7         10.5         10.5
                            SD       11.9        2.3          3.0          3.0
Total WAIS            136   Mean     33.3       11.1         10.5         10.8
                            SD       13.6        2.9          2.4          2.4
Total BB              130   Mean     30.3       11.0         13.5         13.7
                            SD       12.2        2.6          3.9          4.0


Table 2

Test-Retest Reliability Coefficients Under the Four
Administrative Procedures

                                     Procedure
                           WAIS     WAIS      BB       BB
Group                       II      III       II      III
College                    .65      .63      .79      .81
Nonpsychiatric Patient     .71      .61      .81      .82
Brain-Injured              .75      .72      .80      .79
Combined                   .70      .64      .80      .81


Table 2 shows the test-retest reliability co- 
efficients for each group under each type of 
administration. It will be noted that the re- 
liability coefficients resulting from the BB 
administrations are consistently higher than 
those yielded by the WAIS administrations. 
It is also evident that employing the criterion 
of three successive failures for terminating the 
task does not improve test-retest reliability. 

The homogeneity of the sets of three test- 
retest reliability estimates for each adminis- 
tration was assessed by a chi-square test de- 
scribed by Rider (4). Since all four tests 
yielded nonsignificant chi-square values, each 
set of estimates was combined by the method 
of average zs and a single reliability coeffi- 
cient for each administration obtained. These 
total reliability estimates are also shown in 
Table 2. Assessment of the significance of 
the differences in the size of these total reli- 
ability coefficients, based on a comparison of 
the differences in zs with the standard error 
of their difference, showed that the BB II 
administration was more reliable than the 
WAIS II administration to a questionably 
significant degree (.10 > p > .05), that the
BB III administration was significantly more 
reliable than the WAIS II administration 
(.05 > p > .02), and that both BB adminis- 
trations were significantly more reliable than 
the WAIS III administration (.01 > p >
.001).
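
The pooling and comparison described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' exact computation: each group's Fisher z is weighted by n - 3, the difference between pooled zs is referred to the normal curve with a two-tailed probability, and the subgroup sizes are taken from Table 1. The names `pooled_z` and `compare` are hypothetical.

```python
import math

# Test-retest coefficients from Table 2, in the order
# College, Nonpsychiatric Patient, Brain-Injured.
R = {
    "WAIS II": [0.65, 0.71, 0.75],
    "WAIS III": [0.63, 0.61, 0.72],
    "BB II": [0.79, 0.81, 0.80],
    "BB III": [0.81, 0.82, 0.79],
}
N_WAIS = [50, 56, 30]  # subgroup sizes from Table 1
N_BB = [50, 49, 31]


def pooled_z(rs, ns):
    """Average the Fisher zs, weighting each group by n - 3 (an assumption)."""
    weights = [n - 3 for n in ns]
    total = sum(weights)
    z_bar = sum(w * math.atanh(r) for w, r in zip(weights, rs)) / total
    return z_bar, total


def compare(proc_a, ns_a, proc_b, ns_b):
    """Return both pooled rs and a two-tailed normal p for their difference."""
    z_a, w_a = pooled_z(R[proc_a], ns_a)
    z_b, w_b = pooled_z(R[proc_b], ns_b)
    se = math.sqrt(1 / w_a + 1 / w_b)       # standard error of the z difference
    stat = (z_a - z_b) / se
    p = math.erfc(abs(stat) / math.sqrt(2))  # two-tailed normal probability
    return math.tanh(z_a), math.tanh(z_b), p


r_wais2, r_bb3, p_value = compare("WAIS II", N_WAIS, "BB III", N_BB)
print(f"WAIS II r = {r_wais2:.2f}, BB III r = {r_bb3:.2f}, p = {p_value:.3f}")
```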

It is possible that the higher reliability of 
the BB procedures results from the introduction
of new elements into the task that can
be more reliably measured. To investigate this 
possibility, an estimate of the correlation be- 
tween the two types of scores was secured 
from a separate study of the WAIS II and 
BB III performances of 98 college students, 
half of whom received the WAIS II administration
followed, after a 20-30 minute period,
by the BB III administration, and the
other half of whom received the two adminis- 
trations in reverse order. The correlation co- 
efficient between the two administrations was 
.765 for the first half-group and .767 for the 
second half-group. Correction for attenuation 
due to lack of perfect reliability of the meas- 
ures raised this correlation coefficient to .96, 
suggesting that the two tests are imperfectly 
reliable measures of the same true component.
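
The correction for attenuation divides the observed correlation by the square root of the product of the two reliabilities. The sketch below shows the arithmetic only. The text does not say which reliability estimates entered the correction, so the values .79 and .81 used here are placeholder assumptions chosen for illustration, and the function name is hypothetical.

```python
import math


def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Estimate the correlation between true scores from an observed r."""
    return r_xy / math.sqrt(r_xx * r_yy)


# Mean of the two half-group correlations reported in the text.
r_observed = (0.765 + 0.767) / 2

# Placeholder reliabilities (assumed, not taken from the separate study).
r_corrected = correct_for_attenuation(r_observed, r_xx=0.79, r_yy=0.81)
print(f"corrected r = {r_corrected:.2f}")
```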

Determinations of administration time. 
Mean time for the standard (WAIS II) ad- 
ministration was found to be 3 min., 46 sec. 
in a group of eight college students. Mean 
time for the WAIS III administration was 
found to be 4 min., 4 sec. in this group of Ss.
In a second group of seven college students, 
who were comparable in respect to level of 
performance with the first group, the mean 
time for the BB II administration was found 
to be 4 min., 46 sec., exactly a minute longer 
than for the standard administration. Mean 
time for the BB III administration was found 
to be 5 min., 7 sec. for these Ss. Thus, as 
compared with the standard administration, 
the BB III administration involved an in- 
crease of 1 min., 21 sec. in administration 
time. 

Equivalent BB III and WAIS II scores. To
facilitate practical use of the BB III administration,
a table of equivalent BB III-WAIS
II scores was constructed by equating the two
sets of initial scores (136 WAIS scores and
130 BB scores) through employment of the
equipercentile method applied to estimated
distributions of true scores, as described by
Flanagan (2). This table of equivalent BB III
and WAIS II raw scores, together with the
corresponding WAIS scaled score equivalents,
is presented in Table 3. In this table, the
WAIS II equivalent raw scores within the
range of 6 to 26 were derived empirically.
Outside this range, they were derived by logical
extrapolation. The SD of the WAIS II
equivalent raw scores was found to be 4.51,
which was not significantly lower than the
SD (5.65) of the original WAIS II raw
scores, indicating that the transformation did
not result in an important reduction in variability
as would have been the case had it
been effected by means of a regression equation.

Repetition and reversal of digits. The total
digit span test was separated into its “forward”
and “backwards” components and the
test-retest reliabilities of each component under
the WAIS II and BB III administrations
compared. Under the WAIS II administration
the test-retest reliability coefficient of
digits forward was found to be .51. Under the
BB III administration the same statistic was
.75. Under the WAIS II administration the
test-retest reliability coefficient for reversal
of digits was .64. Under the BB III administration
it was .71.


Table 3

Transformation Table for Prediction of WAIS II Raw
Scores and WAIS Scaled Score Equivalents
from BB III Raw Scores

                               WAIS
  BB III       WAIS II      Scaled Score
 Raw Score    Raw Score      Equivalent
    28          17.0            19
    27          16.9            18
    26          16.8            17
    25          16.0            16
    24          15.6            16
    23          15.3            15
    22          14.8            15
    21          14.6            14
    20          14.2            14
    19          13.6            13
    18          12.8            12
    17          12.3            11
    16          11.7            11
    15          11.2            10
    14          10.7            10
    13          10.3             9
    12           9.6             8
    11           9.0             7
    10           8.5             7
     9           8.0             6
     8           7.2             5
     7           6.8             4
     6           6.4             3
     5           6.0             2
     4           5.5             2
     3           5.0             1
     2           4.0             1
     1           3.0             0
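
For routine use, Table 3 amounts to a direct lookup from a BB III raw score. A minimal sketch follows; the values are transcribed from Table 3, while the names `TABLE_3` and `convert_bb3` are hypothetical, and the decision to reject scores outside 1 to 28 is an assumption.

```python
# BB III raw score -> (WAIS II raw score equivalent, WAIS scaled score equivalent)
TABLE_3 = {
    28: (17.0, 19), 27: (16.9, 18), 26: (16.8, 17), 25: (16.0, 16),
    24: (15.6, 16), 23: (15.3, 15), 22: (14.8, 15), 21: (14.6, 14),
    20: (14.2, 14), 19: (13.6, 13), 18: (12.8, 12), 17: (12.3, 11),
    16: (11.7, 11), 15: (11.2, 10), 14: (10.7, 10), 13: (10.3, 9),
    12: (9.6, 8), 11: (9.0, 7), 10: (8.5, 7), 9: (8.0, 6),
    8: (7.2, 5), 7: (6.8, 4), 6: (6.4, 3), 5: (6.0, 2),
    4: (5.5, 2), 3: (5.0, 1), 2: (4.0, 1), 1: (3.0, 0),
}


def convert_bb3(raw_score):
    """Look up the WAIS II raw and WAIS scaled equivalents of a BB III raw score."""
    if raw_score not in TABLE_3:
        raise ValueError(f"BB III raw score {raw_score} is outside Table 3")
    return TABLE_3[raw_score]


wais_raw, wais_scaled = convert_bb3(15)
print(wais_raw, wais_scaled)
```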

Observations on children. The test-retest
reliabilities of the four administrations were
examined in a group of 77 children within
the age range of 6-12 years, 39 of whom
received the WAIS test-retest procedure and 38
of whom received the BB test-retest procedure.
The obtained reliability coefficients
were as follows: WAIS II: .76; WAIS III:
.77; BB II: .88; BB III: .89.

Observations on mental defectives. Test-retest
reliability of performance under the four
conditions of administration was also investigated
for 77 mental defectives, 39 of whom
received the WAIS test-retest procedure and
38 of whom received the BB test-retest procedure.
The obtained reliability coefficients
were as follows: WAIS II: .89; WAIS III:
.88; BB II: .94; BB III: .95.


Discussion 


The findings on the main groups of adult
Ss indicate that slight modifications in the
administration and scoring of the digit span
test result in a significant increase in the
test-retest reliability of the task. Findings on
smaller groups of children and mental defectives
are consistent with this conclusion. With
the standard administration, the test-retest
reliability in a heterogeneous group of adult
Ss was found to be .70. With the revised
administration, the reliability coefficient in a
comparable group of Ss was found to be
.80-.81. Since the revised administrative procedure
generally involved about an additional
minute in administration time, its employment
in the clinical examination would seem
to be indicated.

The table of equivalent scores is designed 
to facilitate routine clinical use of the revised 
administration. While in all probability either 
the BB II or the BB III administrative pro- 
cedure would serve equally well for the pre- 
diction of WAIS II raw score equivalents, 
certain minor considerations made it seem 
wise to select the latter for this purpose. Its 
test-retest reliability was found to be slightly 
higher than that of the BB II procedure. 
Moreover, in the main comparison of reliabilities,
the BB III procedure was more reliable
than the WAIS II procedure to a degree
that was clearly acceptable (p < .05) while
the difference in test-retest reliability between
the BB II and WAIS II procedures did not
quite reach the .05 level in respect to significance.
Since the difference in administration
time between the two BB procedures was
found to be minimal (mean, 21 sec.), no
practical disadvantage is associated with
employment of the BB III procedure.

The transformation table is designed ex- 
pressly for the interpretation of the perform- 
ances of adult Ss who are not grossly defec- 
tive. There is no justification at this time for 
its employment with children or mental de- 
fectives. 

Analysis of digit span performance into the 
separate components of repetition and re- 
versal of digits indicated that the superiority 
of the BB administration was based on the 
fact that it effected an increase in the reli- 
ability of the task of repetition of digits while 
the relatively low reliability of the task of 
reversing digits remained essentially un- 
changed. It must be concluded that the su- 
periority of the BB administration with re- 
spect to test-retest reliability applies only to 
the digit span test as a whole and not to 
any comparisons of digits forward and digits 
backward which may be made. Since this type 
of comparison is often made by the clinical 
examiner, exploration of further procedural 
modifications designed to augment the reli- 
ability of the reversal of digits task ought to 
be attempted. 


Summary 


1. This study explored the effect of certain
modifications in administration and scoring on
the test-retest reliability of the digit span.
These modifications consisted of: (a) having
S repeat or reverse both sets of digits of a
given series length even when he had correctly
repeated or reversed the first set of the
pair; (b) terminating the repetition or reversal
of digits after three successive failures
rather than two; (c) giving credit in scoring
for each set of digits correctly repeated or
reversed rather than by the usual “highest
score” method.

2. It was found that these modifications re- 
sulted in a significant increase in the test-re- 
test reliability of the task. This increase in 
reliability was effected primarily through aug- 
mentation of the reliability of performance on 
the “digits forward” component of the task. 
Under both the standard and revised adminis- 
trations, the “digits backward” component 
showed unsatisfactory reliability. 

3. The time required for the revised ad- 
ministrations was found to be approximately 
80 sec. longer than that required for the 
standard administration. 

4. In order to facilitate clinical use of the 
revised administration a transformation table 
was constructed whereby both raw scores de- 
rived from the standard administration and 
WAIS scale score equivalents can be predicted 
from raw scores derived from the revised ad- 
ministration. 


Received July 26, 1956. 


References 


1. Derner, G. F., Aborn, M., & Canter, A. H. The
reliability of the Wechsler-Bellevue subtests
and scales. J. consult. Psychol., 1950, 14,
172-179.

2. Flanagan, J. C. Units, scores and norms. In E. F. 
Lindquist (Ed.), Educational measurement. 
Washington, D. C.: American Council on 
Education, 1951. Pp. 695-763. 

3. Peatman, J. G., & Locke, N. M. Studies in the 
methodology of the digit span test. Arch. Psy- 
chol. N. Y., 1934, No. 167. 

4. Rider, P. R. An introduction to modern statistical 
methods. New York: Wiley, 1939. 


Journal of Consulting Psychology 
Vol. 21, No. 2, 1957 


The Relationship of the WISC and Stanford-Binet
to School Achievement¹


Ernest S. Barratt and Doris L. Baumgarten 


University of Delaware 


The Stanford-Binet and the WISC are both 
widely used in clinics and schools to estimate 
intellectual capacity. This study is designed 
to relate scores on the WISC and the 1937 
Revision of the Stanford-Binet (Form L) to 
scores on the California Achievement Tests 
(reading and arithmetic subtests) for 30 
achievers and 30 nonachievers in grades 4 to 
6. Achievers and nonachievers were defined by 
teachers’ ratings of students’ school perform- 
ance. 

The results of achievers on the WISC were:
Full Scale (FS) IQ, M = 117.47, σ = 9.81;
Verbal IQ (V), M = 121.17, σ = 10.30;
Performance IQ (P), M = 110.10, σ = 11.46.
Results of nonachievers were: FS, M = 86.90,
σ = 12.46; V, M = 88.23, σ = 13.07; P, M
= 91.50, σ = 11.69. On the Binet, the
achievers’ mean was 126.47, σ = 11.99; the
nonachievers’ mean was 88.27, σ = 13.28.

The rs between reading achievement and 
IQ scores of achievers were: WISC FS, .56; 
WISC V, .61; WISC P, .29; Binet, .62. For 
nonachievers these rs were: WISC FS, .63; 
WISC V, .51; WISC P, .30; Binet, .46. 

The rs between arithmetic achievement and 
IQ scores for achievers were: WISC FS, .14; 
WISC V, .09; WISC P, .14; Binet, .12. For 
nonachievers these rs were: WISC FS, .79;
WISC V, .73; WISC P, .33; Binet, .52. 


1 An extended report of this study may be ob-
tained without charge from Ernest S. Barratt, De- 
partment of Psychology, University of Delaware, 
Newark, Delaware, or for a fee from the American 
Documentation Institute. Order Document No. 5106, 
remitting $2.00 for microfilm or $3.75 for photo- 
copies. 


The rs between WISC subtests and reading 
achievement for achievers were: Inform., .58; 
Comp., .22; Arith., .50; Simil., .61; Vocab., 
.46; Digit S., .18; P. Comp., — .15; P. Arr., 
.32; Bl. Dsg., .52; O. Assemb., .36; Coding,
.15; Mazes, — .02. For nonachievers these rs 
were: Inform., .16; Comp., .09; Arith., .41; 
Simil., .23; Vocab., .25; Digit S., .09; P. 
Comp., .20; P. Arr., .22; Bl. Dsg., .00; O. 
Assemb., .36; Coding, .46; Mazes, .00. 

The rs between arithmetic achievement and 
WISC subtest scores for achievers were: Inform., .11;
Comp., — .02; Arith., .18; Simil., .11; Vocab., 
— .54; Digit S., — .04; P. Comp., .04; P. 
Arr., .11; Bl. Dsg., .15; O. Assemb., .18; 
Coding, .12; Mazes, .00. For nonachievers 
these rs were: Inform., .38; Comp., .42; 
Arith., .43; Simil., .43; Vocab., .51; Digit S., 
— .07; P. Comp., .35; P. Arr., .39; Bl. Dsg., 
.08; O. Assemb., .25; Coding, .34; Mazes, 
.03.

For achievers, one intelligence test is not 
a better predictor than the others of either 
reading or arithmetic achievement. The same 
conclusion is true for the nonachievers except 
for the WISC P score which is not as highly 
related to arithmetic achievement as are the 
WISC V and FS scores. 

Several observations lead to the general 
conclusion that obtaining a true measure of 
arithmetic ability in nonachievers is difficult 
because of their difficulties in using verbal 
symbolism. One distinguishing characteristic 
of the achievers in this study is their rela- 
tively high verbal ability. 

Brief Report. 
Received December 4, 1956. 










Self-Acceptance and Psychopathology 


Marvin Zuckerman¹ and Irwin Monashkin
Larue D. Carter Memorial Hospital, Indianapolis, Ind. 


Rogers (7) has suggested that self-accept- 
ance is a good criterion for progress in psy- 
chotherapy. Supporting this idea is the find- 
ing that the self concept becomes more like 
the ideal concept during the course of psy- 
chotherapy. Implicit in it is the assumption 
that self-acceptance varies directly with ad- 
justment. In a later work (8), Rogers does 
comment that high self-ideal correlations may 
indicate either genuine adjustment or defen- 
sive behavior. However, he still regards the 
rise in self-ideal correlations during psycho- 
therapy as indications of increased adjust- 
ment in the group. 

Berger (1), and Block and Thomas (5) 
found that self-acceptance in college students 
was negatively correlated with certain clinical 
scales of the MMPI and positively correlated 
with the K scale. If the clinical scales are 
considered criteria of adjustment, the correla- 
tions between self-acceptance and these scales 
would suggest that self-acceptance is related 
to adjustment. However, the positive corre-
lation between self-acceptance and the K
scale, which has been interpreted as a meas- 
ure of defensiveness, suggests another hy- 
pothesis. The self-satisfied subjects may be 
maladjusted but defensive or lacking insight 
into their condition. Block and Thomas (5) 
seem to favor this latter hypothesis. They 
view the extremely self-accepting subjects as 
“overcontrolled” and denying. If this view is 
correct it may be possible to view the correla- 
tions between clinical MMPI scales and self- 
acceptance as indications of the common in- 
fluences of defensive processes rather than 
the common influence of underlying adjust- 
ment. 

The purposes of this study are: 


¹ Now at the Institute for Psychiatric Research,
Indiana University Medical Center, Indianapolis, Ind. 







1. To see if the relationships found be- 
tween self-acceptance and particular MMPI 
scales in college students can be replicated in 
a sample of psychiatric patients. 

2. To see if there is any relationship be- 
tween self-acceptance and adjustment using 
an external criterion: a rating of adjustment 
based on the case history. 


Subjects 


The subjects were 43 psychiatric pa- 
tients, including 18 men and 25 women. 
They were all new admissions who had taken 
the MMPI and Shipley-Hartford Vocabulary 
scales. Those with below average Shipley 
scores were not used. Eighteen of the patients 
were diagnosed as psychoneurotic or person- 
ality trait disorder, 22 were diagnosed as 
schizophrenic, and 3 were diagnosed as psy- 
chotic depression. The mean age was 34; 
mean education was 12 years of school; mean 
vocabulary score was 29.3 (equivalent to an 
IQ of about 111). 


Procedure 


The scale used to measure conceptions of 
self and ideal was adapted from a scale de- 
veloped by Buss at this hospital. It consists 
of 16 subscales covering clinically relevant 
dimensions. A list of eight scaled adjectives 
describes the points on each subscale. The S 
rates by choosing one of the eight words 
which he feels is most descriptive of himself, 
and the scale value of that word is used as his 
score for the self concept on that subscale.
The Ss were first asked to rate themselves as 
they are “in general.” On a second copy of 
the scales they were asked to rate themselves 
as they “would like to be.” For each S a dis- 
crepancy score was computed by subtracting 
the scale values of the ratings on the ideal 




concept from the corresponding ratings on the 
self concept. The signs of these differences 
were ignored and the summed differences 
were the discrepancy scores. The larger the 
difference between ideal and self, the more 
dissimilar are the conceptions, and the less 
accepting the individual is of himself. Self- 
acceptance, as defined, varies inversely with 
the size of the discrepancy score between self 
and ideal. 
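
The discrepancy score described above can be sketched in a few lines. This is a minimal illustration and not the authors' scoring procedure: the function name `discrepancy_score` and the example scale values are hypothetical. Only the rule itself comes from the text: sum the self-ideal differences across the 16 subscales, ignoring sign.

```python
def discrepancy_score(self_ratings, ideal_ratings):
    """Sum of absolute self-minus-ideal differences across subscales.

    A larger score means self and ideal are more dissimilar,
    i.e., lower self-acceptance as defined in the text.
    """
    if len(self_ratings) != len(ideal_ratings):
        raise ValueError("self and ideal ratings must cover the same subscales")
    return sum(abs(s - i) for s, i in zip(self_ratings, ideal_ratings))


# Hypothetical scale values for one S on the 16 subscales (illustration only).
self_ratings = [3, 5, 2, 6, 4, 4, 7, 1, 5, 3, 2, 6, 4, 5, 3, 2]
ideal_ratings = [2, 5, 1, 3, 4, 2, 4, 1, 6, 3, 1, 5, 4, 2, 3, 1]

score = discrepancy_score(self_ratings, ideal_ratings)
print(score)
```

Because the signs are ignored, a S who rates himself above his ideal on one subscale and below it on another accumulates discrepancy on both.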

Case-history rating. Adjustment was de- 
fined by ratings of the patients based on 
their case histories. The writers independently 
rated the final case summaries in the patients’ 
charts on a 7-point scale of adjustment. Fac- 
tors considered in the global rating were 
severity of symptoms, bizarreness of idea-
tion, degree of incapacitation, acuteness or
chronicity of the disorder, adequacy of pre- 
morbid adjustment, and course of the dis- 
order. In general, psychotics tended to fall at 
the upper end of the scale and neurotics at 
the lower end, with some overlap in the mid- 
dle between severe neurosis and mild psy- 
chosis. 

Interrater reliability for the 43 cases was 
.77. Using the same rating on 60 cases in an- 
other study a reliability of .80 was obtained. 
The ratings of the two judges were averaged 
for each patient to get the case-history rating. 
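
The two steps just described, checking agreement between the judges and then averaging their ratings, might look roughly as follows. This sketch assumes that the reliability coefficient is a Pearson product-moment correlation, which the text does not state. The ratings shown are hypothetical values on the 7-point scale, not the study's data.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical ratings by the two judges for five patients (illustration only).
judge_a = [2, 4, 5, 3, 6]
judge_b = [3, 4, 6, 2, 6]

# Agreement between judges, then the averaged rating used as the criterion.
reliability = pearson_r(judge_a, judge_b)
case_history_rating = [(a + b) / 2 for a, b in zip(judge_a, judge_b)]
print(round(reliability, 2), case_history_rating)
```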


Table 1


Correlations Between Self-Acceptance
and MMPI Scales


                                   Block and     Zuckerman and
          Berger      Berger       Thomas        Monashkin
Sample    Students    Students     Students      Patients
Sex       Men         Women        M & W         M & W
N         109         76           56            43

F         --          --           -.54*         -.54*
K          .58*        .57*        [illegible]    .38*
Hs        -.08        -.25*        -.59*         -.31*
D         -.45*       -.54*        -.63*         -.42*
Pd        -.03        -.26*        -.62*         -.13
Pa        -.30*       -.29*        -.11          -.39*
Pt        -.52*       -.55*        -.69*         -.51*
Sc        -.40*       -.49*        -.63*         -.53*
Ma        -.11        -.12         -.22          +.16
Si        -.63*       -.70*        --            -.52*

* Significant at or below the .05 level.


Results 


The correlations between self-acceptance 
and MMPI scales obtained in our patient 
population are compared with the correlations 
found by Berger (1), and Block and Thomas 
(5) in Table 1. In general the results are re- 
markably similar. They are similar in spite of 
the differences in instruments used and in 
types of population. Berger used an inventory 
to measure self-acceptance, Block and Thomas 
used Q sorts of self and ideal, and we used 
adjective rating scales for self and ideal. 
Berger, and Block and Thomas used college 
students while we used psychiatric patients, 
who presumably cover a wider range of the 
adjustment continuum. The positive correla- 
tion between self-acceptance and the K scale 
is obtained in all three studies. The negative 
correlations between self-acceptance and the 
D, Pt, and Sc scales are also found in all 
three studies. The negative correlation be- 
tween self-acceptance and the Hs scale is ob- 
tained in the three studies with the exception 
of the college men in Berger’s study. The 
negative correlations between self-acceptance
and the F and Si scales were found in the 
two studies where they were measured. The 
negative correlation between self-acceptance 
and the Pa scale was found only in Berger’s 
and this study. The positive correlation with 
Hy was found only in Berger’s male Ss, and 
the negative correlation with Pd was found in 
Berger’s female Ss, and Block and Thomas’ 
male and female Ss. 

The present results replicate six out of the 
six relationships found between self-accept- 
ance and MMPI scales in both sexes in the 
Berger study, and six out of seven of the re- 
lationships found in the Block and Thomas 
study. 

Another way to analyze the MMPI data, 
which is closer to usual clinical practice, is to 
analyze the peaks in the profile disregarding 
the height of the profile. Table 2 shows the 
percentages of either the low self-accepting or 
high self-accepting groups (defined as those 
above and below the median) having any one 
of the MMPI scales as the first or second 
highest scale in the profile. The low self-ac- 
cepting group peaks significantly more often 
on the D and Pt scales. The high self-accept-










Table 2


Percentages of Low and High Self-Accepters
Peaking† on MMPI Scales


Scale     Low      High     CR
Hs        14.3     13.0      .12
D         52.4     22.7     2.01*
Hy         9.5     18.2      .59
Pd        28.6     54.5     1.73
Pa         9.5     13.0      .36
Pt        42.9      9.1     2.54*
Sc        52.4     36.4     1.06
Ma         0.0     31.8     2.84*

* Significant at or below the .05 level.
† “Peaking” is defined by a subject having the particular
scale as the first or second highest scale in his profile.


ing group peaks significantly more often on 
the Ma scale and shows a tendency, short of 
significance, to peak more often on the Pd 
scale. 
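
The critical ratios (CR) in Table 2 can be approximated with a pooled two-proportion test. The sketch below rests on assumptions not stated in the paper: the function name `critical_ratio` is hypothetical, and the counts are reconstructed from the percentages on the assumption that the median split gave 21 low and 22 high self-accepters.

```python
import math


def critical_ratio(k_low, n_low, k_high, n_high):
    """Absolute difference between two proportions over its pooled standard error."""
    p_low = k_low / n_low
    p_high = k_high / n_high
    pooled = (k_low + k_high) / (n_low + n_high)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_low + 1 / n_high))
    return abs(p_low - p_high) / se


# Assumed group sizes for the median split of the 43 patients.
N_LOW, N_HIGH = 21, 22

# Counts reconstructed from the Table 2 percentages (assumption).
cr_d = critical_ratio(11, N_LOW, 5, N_HIGH)   # D: 52.4% vs. 22.7%
cr_pt = critical_ratio(9, N_LOW, 2, N_HIGH)   # Pt: 42.9% vs. 9.1%
print(round(cr_d, 2), round(cr_pt, 2))
```

Under these assumptions the two values come out close to the tabled 2.01 and 2.54, which suggests, though it does not prove, that a pooled standard error of this kind was used.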

The final question in our study is the re- 
lationship between self-acceptance and ad- 
justment as defined by an external criterion, 
the case history rating. The correlation be- 
tween self-acceptance and the case history 
rating was — .06, clearly not significantly 
different from zero. The plotted data reveal 
neither a rectilinear nor a curvilinear rela- 
tionship. 


Discussion 


Why should a tendency to be satisfied with 
oneself be related to low scores on MMPI 
scales? Rosen (9) has found that the rela- 
tionship between self-indorsement of MMPI 
items and personal desirability of these items 
is .87 for both sexes. The scales which are 
most consistent in their negative correlations 
with self-acceptance (D, Pt, Sc, Si) are pre- 
cisely those scales most influenced by desir- 
ability of the items in Rosen’s study. The K 
scale, which is positively correlated with self-
acceptance, is also correlated with the tend-
ency to answer MMPI items in the direction
of desirability. The person who is self-satis- 
fied is likely to answer MMPI items in a way 
which he considers personally and socially de- 
sirable. Thus, both self-acceptance and MMPI 
scales are probably being influenced more by 
the common trait of defensiveness than by 
actual adjustment. In the present study 6 of 
the 9 paranoid schizophrenics were in the high 


self-accepting third of the distribution. De- 
spite this fact, the Pa scale correlated nega- 
tively with self-acceptance. Apparently if a 
patient is highly self-accepting, he can evade 
detection on a scale designed specifically to 
reveal his traits. Other patients who are not 
paranoid score high on the Pa scale merely 
because of a readiness to admit negative 
traits in themselves. 

Our self-dissatisfied patients tend to score 
higher on the D scale and peak more often 
on this scale than the self-satisfied patients. 
In the MMPI manual, a high D score is said 
to indicate “poor morale of the emotional
type with a feeling of uselessness and inabil-
ity to assume a normal optimism with regard
to the future . . . lack of self-confidence,
tendency to worry, narrowness of interests,
and introversion” (6, p. 19). The correlation
between these traits and lack of self-accept- 
ance is not surprising. Bills found that people 
with low self-acceptance show depressive signs 
on the Rorschach (3), and tend to internalize 
blame for their problems (4). Our low self- 
accepters tend to be higher and peak more 
frequently on the Pt scale. The manual de- 
scribes the Pt scale as indicating phobias, 
compulsive behavior, mild depression and 
lack of confidence. Interpreting the other cor-
relations found in Table 1, it seems that low
self-accepting patients tend to describe them- 
selves as concerned about bodily functions 
(Hs), suspicious and oversensitive (Pa), pos- 
sessing bizarre or unusual thoughts (Sc) and 
introverted (Si). 

The high self-accepting patients peak more 
frequently than low self-accepters on the Ma 
scale. Some characteristics of people scoring 
high on this scale are “active . . . enthusi- 
astic . . . disregard of social conventions.” 
Actually the most frequent peak in the high 
self-accepting group is the Pd scale. This 
scale is related to “absence of deep emotional 
response, inability to profit from experience 
and disregard of social mores.” 

The contrast between the types of person- 
alities descriptive of low and high self-ac- 
cepters is marked. The high self-accepter is 
defensive, tends to act out his problems and 
externalizes blame; while the low self-ac- 
cepter internalizes, socially withdraws, and 
suffers from depression and doubt. These find- 







ings are consistent with Bills’s results (2) on 
the differences in the Rorschachs given by 
low and high self-accepters. 

Although self-acceptance probably reflects 
differences in modes of handling personal 
maladjustment, self-acceptance is not related
to actual adjustment in patients as seen by 
others. Self-acceptance might conceivably 
bear more of a relationship to actual adjust- 
ment in outpatients who come in voluntarily 
to receive psychotherapy. Perhaps in this type 
of group defensive patterns are less common. 
However, it is apparent that self-acceptance 
should not be used as the sole criterion for 
improvement in psychotherapy. In fact, any 
questionnaire type of test which does not con- 
trol for item desirability should not be used 
as a sole criterion since it is probably sub- 
ject to the same defensive effects that affect 
self-ratings. 


Summary 


This study was undertaken to see if rela- 
tionships between self-acceptance and MMPI 
scales found in college students could be 
replicated in patients; and to see if these re- 
lationships were due to a real relationship 
between self-acceptance and adjustment. 

The subjects were 43 psychiatric patients. 
They rated their self and ideal concepts on 
adjective scales. The discrepancy between rat- 
ings on these two concepts was used as a 
measure of self-acceptance. All patients had 
taken the MMPI. Ratings of adjustment were 
made on the basis of the final case summary 
on the patients. 

Significant negative relationships were
found between self-acceptance and the F, Hs,
D, Pa, Pt, Sc, and Si scales of the MMPI. 
A significant positive relationship was found 
between self-acceptance and the K scale. 







Most of these relationships replicated the re- 
sults found in college students. Using a pro- 
file analysis approach, the low self-accepters 
were found to have D and Pt as their first or 
second highest scores significantly more fre- 
quently than high self-accepters. High self- 
accepters were found to peak significantly 
more frequently on Ma. There was no rela- 
tionship between self-acceptance and adjust- 
ment as measured by the case-history rating. 

The results were interpreted in the light of 
the influence of personal and social desir- 
ability on MMPI items, and different modes 
of handling problems in low and high self- 
accepters. 


Received May 28, 1956. 


References 


1. Berger, E. M. Relationships among acceptance of 
self, acceptance of others, and MMPI scores. 
J. counsel. Psychol., 1955, 2, 279-284. 

2. Bills, R. E. Rorschach characteristics of persons 
scoring high and low in acceptance of self. J. 
consult. Psychol., 1953, 17, 36-38. 

3. Bills, R. E. Self concepts and Rorschach signs of 
depression. J. consult. Psychol., 1954, 18, 135-
137.

4. Bills, R. E., Vance, E. L., & McLean, O. S. An 
index of adjustment and values. J. consult. 
Psychol., 1951, 15, 257-261. 

5. Block, J., & Thomas, H. Is satisfaction with self 
a measure of adjustment? J. abnorm. soc. 
Psychol., 1955, 51, 254-259. 

6. Hathaway, S. R., & McKinley, J. C. Minnesota 
Multiphasic Personality Inventory, Manual. 
New York: Psychological Corp., 1951. 

7. Rogers, C. R. The case of Mrs. Oak—a research 
analysis. Psychol. Serv. Center J., 1951, 3, 47- 
165. 

8. Rogers, C. R., & Dymond, Rosalind F. Psycho- 
therapy and personality change. Chicago: Uni- 
ver. of Chicago Press, 1954. 

9. Rosen, E. Self-appraisal and perceived desirability 
of MMPI personality traits. J. counsel. Psy- 
chol., 1956, 3, 44-51. 

























Disorientation as a Prognostic Criterion¹


A. Eskey, Gladys Miller Friedman, and Ira Friedman


Cleveland Receiving Hospital and State Institute of Psychiatry


Anecdotal evidence lends support to the
view that patients who are acutely disturbed,
showing marked signs of confusion and
disorientation, tend to have a more favorable
prognosis than patients whose symptoms are
more insidious and less dramatic. Chase and
Silverman (2) point out that, among other
indices, acute onset and confusion are
favorable prognostic signs. Albee points out that
“psychiatrists with long and intimate
experience with schizophrenics are all too familiar
with the patient whose relatively mild overt
symptomatology defies every therapeutic
effort, while another patient whose symptoms
appear to be indicative of total personality
disorganization achieves a sudden and
unexpected remission” (1, p. 208). Reviews of
prognostic signs in schizophrenia are also
discussed by Mayer-Gross and Moore (6),
Lewis (4), Strecker (7), and Malamud and
Render (5). There appears to be little doubt
that marked changes frequently occur with
severely disturbed patients, but there is no
systematic investigation which attempts to
assess whether this change occurs more
frequently than with patients who display greater
ego-intactness and less extreme symptomatology.
The purpose of the present study is to
evaluate the prognostic significance of extreme
symptomatology.

Disorientation was selected as the
independent variable, because it is traditionally
employed in psychiatric description. It also
represents an objective measure of lack of
ego-intactness, and it correlates highly with
confusion (if not being identical with it).
Because improvement is always a difficult
variable to measure, we reasoned that it would
be valid to select patients discharged as
improved, and the length of hospital stay would
tell us how rapidly they improved, i.e.,
prognosis with reference to duration of illness
rather than to degree of improvement. The
study, specifically, is a comparison between
an oriented and disoriented group in terms
of length of hospital stay.


¹ The authors wish to acknowledge the contribution
of Dr. George Albee in originally suggesting the
area of investigation.


Method 


One hundred psychotic patients exhibiting 
varying degrees and types of disorientation 
upon admission to Cleveland Receiving Hos- 
pital and State Institute of Psychiatry were 
selected at random from reports of mental 
status examinations. A control group of 100 
patients who were well oriented upon admis- 
sion was in like manner chosen. Only first ad- 
missions coming to the hospital during the 
years 1952 through 1955 were used. The age 
range was 16 to 59 inclusive, with all pa- 
tients being eliminated from the study who 
remained in the hospital less than 14 days. 
In order to control factors which were felt to 
influence length of hospitalization, the fol- 
lowing types of cases were rejected: patients 
with a diagnosis of acute or chronic brain syn- 
drome, alcoholics, mental defectives, chronic 
patients who were transferred to long-term 
treatment centers, those who were not im- 
proved upon discharge, and those who were 
released against medical advice. In brief, our 
two groups were composed of selected psy- 
chotic patients who stayed in the hospital 
more than two weeks and who were dis- 
charged from the hospital as improved or re- 
covered. 

The groups were matched on the basis of 
age, sex, and psychiatric diagnosis. In match- 
ing, age distributions were separated into 
five-year intervals. If, for example, there 




were four females and one male in the dis- 
oriented group age 30 through 34, a match- 
ing sample was obtained for the correspond- 
ing age group among the oriented patients. 
Other factors such as marital status, race, 
and type of treatment were approximately 
equal between groups. Accuracy of matching 
is apparent from Table 1. 

Disorientation in the spheres of time, place, 
and person was selected as an indication of 
confusion. In some instances the patients 
were disoriented in only one area. In a few 


Table 1


Factors Showing Accuracy of Matching
Between Groups


Criterion                        Oriented    Disoriented
Mean Age                           35.8         35.4
Sex
  Male                              35           35
  Female                            65           65
Race
  White                             74           66
  Negro                             26           34
Marital Status
  Married                           48           53
  Single                            34           29
  Separated                          3           10
  Divorced                          10            6
  Widow                              5            1
  Widower                            0            1
Clinical Groups
  Schizophrenia
    Paranoid                        44           44
    Catatonic                       20           20
    Simple                           4            4
    Hebephrenic                      1            1
    Chronic Undifferentiated         7            7
    Acute Undifferentiated           2            2
    Schizo-Affective                 2            2
  Manic-Depressive
    Manic                            2            2
    Depressed                        5            5
  Involutional Psychosis            13           13
Treatment
  ECT                               63           81
  ECT and Insulin                    9        [illegible]
  Insulin                            1            1
  Other                             27           14





cases there was some question as to whether 
the patient was actually disoriented. In this 
latter instance three psychologists examined 
the records, and only those cases where there 
was complete agreement were included. Ac- 
cordingly, patients who knew the date within 
four or five days, for example, were not in-
cluded in the disoriented group. The differ-
ence in length of hospitalization between the
oriented and disoriented groups was subjected
to statistical analysis.


Results 


When all of the patients were ar-
ranged in frequency tables based on 30-day
intervals of hospitalization, it became appar- 
ent that neither group was normally dis- 
tributed. Both distributions were skewed in 
that they grouped toward the low end of the 
scale, i.e., the great majority of discharges 
occurred within a 120-day period. At the op- 
posite end of the distributions (greater num- 
ber of days of hospitalization), it was found 
that there were more disoriented than ori- 
ented patients. Because of these few extreme 
cases, the median was used as a basis for com- 
parison between the groups, as it is less af- 
fected than the mean by extreme scores. The 
results showed that the oriented group had a 
median of 73.9 days of hospitalization with 
an SE of 6.87, while the disoriented group 
had a median of 85.4 with an SE of 9.55. 
Figure 1 shows the distributions. 

A comparison was also made between the 
two groups containing the largest number of 


[Figure: number of patients plotted against time spent in hospital (in days) for the groups oriented and disoriented on admission]


Fig. 1. Length of hospitalization for oriented and
disoriented groups.
















cases with respect to specific kinds of disori- 
entation. These groups were patients disori- 
ented for time alone and those who were dis- 
oriented for both time and place. In other 
words, we wanted to compare a group of pa- 
tients with greater confusion as reflected by 
disorientation for both time and place with 
a group disoriented for time alone. Those 
disoriented for time had a median of 83.2 
days of hospitalization with an SE of 11.17 
and the time-place disoriented group had a 
median of 89.0 days with an SE of 17.66. 
There were too few patients with disorienta- 
tion for person to include this factor in the 
comparisons which were made. 

When the reliability of the difference be-
tween the medians of the oriented and dis-
oriented groups was determined, a t value of
1.02 (p = .30) was obtained. Thus when the 
most appropriate statistical measures were 
employed (comparison based on medians), 
the results indicate no significant difference 
between the groups. 

When the comparison was made of the 
group disoriented for time alone with the 
one disoriented for both time and place, the 
resulting t value of .27 (p = .80) shows no
significant difference between the groups. 
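
The two comparisons of medians can be roughly reproduced from the reported medians and standard errors. The paper does not give its formula, so this sketch assumes that the standard error of the difference is the square root of the sum of the two squared SEs. The function name `median_difference_ratio` is hypothetical.

```python
import math


def median_difference_ratio(median_a, se_a, median_b, se_b):
    """Difference between two medians divided by the SE of that difference.

    Assumes independent groups, so the two SEs combine as a root sum of squares.
    """
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(median_a - median_b) / se_diff


# Medians and SEs as reported in the Results section.
oriented_vs_disoriented = median_difference_ratio(73.9, 6.87, 85.4, 9.55)
time_vs_time_place = median_difference_ratio(83.2, 11.17, 89.0, 17.66)
print(round(oriented_vs_disoriented, 2), round(time_vs_time_place, 2))
```

Under this assumption the ratios are about 0.98 and 0.28. These are near the reported 1.02 and .27 but not identical to them, so the authors' exact computation may have differed in detail. Either way, both ratios fall far short of conventional significance.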


Discussion 


The results indicate that patients who are 
disoriented do not improve more rapidly than 
patients with a greater degree of orientation— 
a finding contrary to current thinking. It may 
well be that when an acutely disturbed pa- 
tient recovers rapidly, the change is quite 
dramatic. On the other hand, the change in 
patients who do not display such intense be- 
havioral, overt symptomatology is much less 
striking. According to Gestalt principles of 
remembering (3) we are more prone to be 
impressed by cases where the end result, as 
compared to the initial picture of the pa- 
tient, is characterized by contrast rather than 
similarity. The results lend support to the 
hypothesis that our anecdotal reporting may 
be subject to unwarranted overgeneralization. 

It is also possible that the variable of dis- 
orientation is not sufficiently representative of 
“acute onset . . . confusion . . . and atypi- 
cal symptomatology” (2) to yield consistent 
results in the anticipated direction. It is sug- 







gested that further research may prove fruit- 
ful, taking additional variables into account. 


Summary 


The study is an investigation of the prog- 
nostic significance of disorientation in terms 
of the rapidity of improvement. One hundred 
disoriented patients were matched with 100 
oriented patients on the basis of age, sex, and 
psychiatric diagnosis. Precautions were taken 
to eliminate those patients from the study 
where certain nonrelevant factors might have 
operated to influence the length of their resi- 
dence in the hospital. Results indicate that 
there is no significant difference between 
groups with reference to length of time they 
remained in the hospital. No significant dif- 
ference was shown between patients disori- 
ented for time alone and those disoriented 
for both time and place. The results were ex-
plained in terms of Gestalt principles of simi-
larity and contrast, i.e., that we are more
prone to be impressed by dramatic improve- 
ments where the patient is strikingly differ- 
ent from his initial disturbed state. While 
this study suggests our anecdotal reporting 
may result in unwarranted overgeneraliza- 
tion, the possibility that disorientation is 
not sufficiently representative of “acute onset
. . . confusion and . . . atypical symptoms”
was pointed out. Further research is recom- 
mended. 


Received July 2, 1956. 


References 


1. Albee, G. W. The prognostic importance of delu- 
sions in schizophrenia. J. abnorm. soc. Psy- 
chol., 1951, 46, 208-212. 

2. Chase, L. S., & Silverman, S. Prognostic criteria
in schizophrenia. Amer. J. Psychiat., 1941-42, 
98, 360-368. 

3. Koffka, K. Principles of Gestalt psychology. New
York: Harcourt, Brace, 1935.

4. Lewis, N. D. Prognostic significance of certain 
factors in schizophrenia. Arch. Neurol. Psy- 
chiat., 1945, 53, 241-242. 

5. Malamud, W., & Render, N. Course and prog-
nosis in schizophrenia. Amer. J. Psychiat.,
1939, 95, 1039-1057. 

6. Mayer-Gross, W., & Moore, N. P. Schizophrenia. 
J. ment. Sci., 1944, 90, 241-255. 

7. Strecker, E. Prognosis in schizophrenia. Res. Publ. 
Ass. nerv. ment. Dis., 1931, 10, 119-190. 






































On the Relation Between A-Scale Scores and 
Digit Symbol Performance 


Leonard D. Goodstein and I. E. Farber 


State University of Iowa


In a number of recent studies, Matarazzo 
and his colleagues (3, 4, 5, 6) have been con- 
cerned with the possibility that the function 
relating scores on the Taylor A scale and per- 
formance in various other tasks involves a 
reversal of sign in the slope. Specifically, they 
have assumed that the maximum appears in 
the middle ranges of anxiety, with lower 
scores at the extremes. As Farber and Spence 
(1) have noted, this sort of relation might oc- 
casionally be expected on theoretical grounds. 
But, despite notably persistent attempts to 
verify this hypothesis, the proponents of this 
view have found scant evidence for it in their 
own studies. Moreover, contrary to their sup- 
position (3, 6) no evidence of such a relation 
has been found in conditioning experiments. 
Spence and Taylor (7) have stated that Ss 
in the middle ranges of anxiety sometimes 
perform more like nonanxious than like anx- 
ious Ss, indicating that the relation between 
A-scale scores and conditioning performance 
may be curvilinear, i.e., nonlinear. But this is 
quite different from the notion that the rela- 
tion is nonmonotonic, with a reversal of sign 
in slope. There has apparently been some 
confusion concerning the nature of curvi- 
linear and nonmonotonic functions. 

The question remains, of course, whether 
in any specific situation there is any relation 
between anxiety and performance, and, if so, 
whether it is linear, curvilinear but mono- 
tonic, or curvilinear and nonmonotonic with 
a maximum at an intermediate level of anx- 
iety. Thus, Matarazzo and Phillips (3) have 
reported findings for a Digit Symbol task 
that might be interpreted in favor of a curvi-
linear relation, in that t tests showed that
Ss at a low level of anxiety performed more
poorly (p < .05) than those at two inter-
mediate levels. There was no evidence that
the relation was not monotonic, since the 
most anxious Ss did not differ significantly 
from those at the intermediate levels. How- 
ever, as these writers conjectured, it is pos- 
sible that this negative result was due to 
their inability to use a more extreme group at 
the high end of the scale because of the small 
N involved.¹

The present study presents the results of 
an attempt to check the foregoing results 
and hypotheses. Following the procedure of 
Matarazzo and Phillips, a 175-item form of 
the Wechsler-Bellevue Digit Symbol test and 
the Taylor A scale were administered to 409 
college underclassmen, 205 men and 204 
women. 

Results 


The Digit Symbol performances of the men 
and women at six levels of anxiety are pre- 
sented in Table 1. The class intervals defin- 
ing the levels were the same as those used by 
Matarazzo and Phillips, except that a more 
extreme group was added at the high end, in 
accordance with their suggestion. The results 
obtained in the earlier investigation are also 
presented, for purposes of comparison. 


¹ The distribution of A-scale scores obtained in the
study by Matarazzo and Phillips differed markedly
from that of the Iowa standardization population
(8). On the basis of the latter scores, over 20% of
the group might have been expected to have scores
of 21 or higher. A χ² test of the goodness of fit of
their frequencies to the expected frequencies at the
five levels of anxiety yields a value of 20.9, which,
for 2 df, is significant at p < .001. The medical
students who comprised the sample in the former study
were apparently less anxious than introductory
psychology students at Iowa.




Table 1
Digit Symbol Scores as a Function of Anxiety Level

Present study: Men

Level    A-scale interval      N       M       SD
1        0-5                  15    106.0    17.5
2        6-10                 52    109.8    18.3
3        11-15                61    106.5    14.9
4        16-20                43    105.4    18.2
5        21-25                19    101.5    15.7
6        26-40                15    112.1    14.4
Total                        205    107.0    18.3


Since the women’s mean performance was
significantly better than that of the men (t
= 4.6, p < .001), and that for the men more
closely resembled the results obtained by
Matarazzo and Phillips, separate analyses of
variance were applied to the data for the two
sexes. In neither instance did the differences
among levels approach significance. For the
men, the value of F was less than unity, while
for the women the F value was 1.46, which
for 5 and 199 df is not significant at an ac-
ceptable level of confidence (.20 < p < .30).
There was thus no reason to suppose that
any relation, curvilinear, nonmonotonic, or
any other kind, obtained between anxiety and
performance in the Digit Symbol task.
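
As a rough check on the analysis of variance just described, the following sketch recomputes a one-way F ratio from the group sizes, means, and standard deviations printed for the women in Table 1. It assumes that the tabled SDs use an n - 1 denominator, which the source does not state, and the function name is illustrative only. Because the inputs are rounded, the result (roughly 1.5) can only approximate the reported value of 1.46.

```python
# One-way ANOVA F ratio from summary statistics (n, mean, SD per group).
# ASSUMPTION: each SD uses an n - 1 denominator; the source does not say.

def f_from_summary(groups):
    """groups: list of (n, mean, sd) tuples. Returns the F ratio."""
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / total_n
    # Between-groups sum of squares: weighted squared deviations of group means.
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    # Within-groups sum of squares recovered from each group's SD.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    ms_between = ss_between / (len(groups) - 1)
    ms_within = ss_within / (total_n - len(groups))
    return ms_between / ms_within

# Women, Levels 1-6, as printed in Table 1.
women = [(14, 110.9, 14.5), (36, 116.4, 19.9), (57, 117.8, 20.6),
         (46, 118.5, 17.8), (26, 113.3, 18.5), (25, 107.6, 16.3)]

f_women = f_from_summary(women)
print(f"F for women, recomputed from rounded table entries: {f_women:.2f}")
```

The same function can be applied to any set of levels for which N, M, and SD are tabled; exact agreement with a published F should not be expected from rounded summaries.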

Despite our grave doubts concerning the
legitimacy of comparing individual pairs of
means, in view of these results, it might be
noted in Table 1 that, for the men, the per-
formance of the least anxious Ss (Level 1)
was actually superior to that of one inter-
mediate group (Level 4), and that the per-
formance of the most anxious Ss (Level 6),
far from being inferior to that of the inter-
mediate groups, was best of all. However, in
not a single instance was a t between means
at various levels significant. In respect to the
women, the least anxious Ss (Level 1) did
not differ significantly from Ss at any other
level. Use of the t test did reveal significant
differences between the most anxious Ss and
those at two intermediate levels. Using an un-
biased estimate of the error variance of the
difference between means (2, p. 91), t be-
tween Levels 6 and 4 was 2.28 and that be-
tween Levels 6 and 3 was 2.20. Assuming
(incorrectly, as we believe) that t tests are
permissible under these circumstances, both
values are significant, p < .05.

Table 1 (continued)

                               Present study:             Matarazzo &
                               Women                      Phillips (3)
Level    A-scale interval      N       M       SD         N       M
1        0-5                  14    110.9    14.5        18    101.8
2        6-10                 36    116.4    19.9        41    109.7
3        11-15                57    117.8    20.6        34    109.8
4        16-20                46    118.5    17.8        15    108.8
5        21-25                26    113.3    18.5
6        26-40                25    107.6    16.3        11    100.4
Total                        204    115.4    19.1       119    108.1

Note. Matarazzo and Phillips used five levels; the entry on the last row is their highest group.
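
The comparisons between individual levels can be sketched in the same way. The reading assumed here, which the source does not spell out beyond its citation of Lindquist (2, p. 91), is that the within-groups mean square from all six levels serves as the error term for each difference between means. The data are the women's entries from Table 1, and the function names are illustrative. Under these assumptions the two t values come out near 2.3, close to but not identical with the reported 2.28 and 2.20.

```python
import math

# Women, Levels 1-6, as printed in Table 1: level -> (n, mean, sd).
women = {1: (14, 110.9, 14.5), 2: (36, 116.4, 19.9), 3: (57, 117.8, 20.6),
         4: (46, 118.5, 17.8), 5: (26, 113.3, 18.5), 6: (25, 107.6, 16.3)}

def within_mean_square(groups):
    """Pooled within-groups variance; ASSUMES SDs use an n - 1 denominator."""
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups.values())
    df_within = sum(n for n, _, _ in groups.values()) - len(groups)
    return ss_within / df_within

def t_between_levels(groups, level_a, level_b):
    """t for mean(level_a) - mean(level_b), using the pooled error term."""
    n_a, mean_a, _ = groups[level_a]
    n_b, mean_b, _ = groups[level_b]
    standard_error = math.sqrt(within_mean_square(groups) * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error

t_4_vs_6 = t_between_levels(women, 4, 6)
t_3_vs_6 = t_between_levels(women, 3, 6)
print(f"Level 4 vs. Level 6: t = {t_4_vs_6:.2f}")
print(f"Level 3 vs. Level 6: t = {t_3_vs_6:.2f}")
```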

Similar analyses based on only the five lev- 
els used by Matarazzo and Phillips yielded 
results that were completely congruent with 
those derived from the analysis for six levels. 


Discussion 


It may be seen that these results offer little 
support to the view that any relation, non- 
monotonic or other, exists between A-scale 
scores and Digit Symbol performance. The 
specific hypotheses of Matarazzo and Phillips 
that the mean at Level 1 would be signifi- 
cantly poorer than the means at each level 
from 2 to 5 were not substantiated in a single 
instance. The specific hypothesis that Level 3 
would be superior to 6 was substantiated in 
the case of women, but the results were in 
the opposite direction for men. Taken in con- 
junction with their own findings (3), not a 
single suggested relation between individual 
levels is unequivocally borne out. Conse- 
quently, the argument that despite the non- 
significance of an over-all test, simple tests of 
significance between levels are permissible on 
the basis of previous empirical results in this 
area must be flatly rejected in the future. Of
course this does not preclude tests of simple 
effects based on theoretical expectations. But, 
in the latter case, it would be necessary for 
the theory to predict the locus of the maxi- 
mum much more precisely than has yet been 
done. Otherwise, there is no guarantee that 







inadvertent advantage might not be taken of 
chance differences among levels. 


Summary 


The present study was concerned with the 
hypothesis that the relation between Taylor 
A-scale scores and performance on a Digit 
Symbol task is nonmonotonic, with Ss in the 
middle ranges of the A-scale distribution per- 
forming better than those at the extremes. 
With Ss classified at six levels of anxiety, no 
consistent evidence was obtained to support 
this hypothesis or any more general hypothe- 
sis concerning a relation between A-scale and 
Digit Symbol scores. 


Received July 17, 1956. 


References 


1. Farber, I. E., & Spence, K. W. Effects of anxiety,
stress, and task variables on reaction time. J.
Pers., 1956, 25, 1-18.

2. Lindquist, E. F. Design and analysis of experi-
ments in psychology and education. Boston:
Houghton Mifflin, 1953.

3. Matarazzo, J. D., & Phillips, Jeanne S. Digit
symbol performance as a function of increas-
ing levels of anxiety. J. consult. Psychol., 1955,
19, 131-134.

4. Matarazzo, J. D., Ulett, G. A., Guze, S. B., &
Saslow, G. The relationship between anxiety
level and several measures of intelligence. J.
consult. Psychol., 1954, 18, 201-205.

5. Matarazzo, J. D., Ulett, G. A., & Saslow, G. Hu-
man maze performance as a function of in-
creasing levels of anxiety. J. gen. Psychol.,
1955, 53, 79-95.

6. Matarazzo, Ruth G., & Matarazzo, J. D. Anxiety
level and pursuit-meter performance. J. con-
sult. Psychol., 1956, 20, 70.

7. Spence, K. W., & Taylor, Janet A. The relation
of conditioned response strength to anxiety
in normal, neurotic, and psychotic subjects.
J. exp. Psychol., 1953, 45, 265-272.

8. Taylor, Janet A. A personality scale of manifest
anxiety. J. abnorm. soc. Psychol., 1953, 48,
285-290.














Journal of Consulting Psychology
Vol. 21, No. 2, 1957









Some Social and Cultural Factors Determining
Relations Between Authoritarianism and
Measures of Neuroticism¹


Anthony Davids 


Brown University and Bradley Home 


and Charles W. Eriksen 


University of Illinois 


In a recent evaluation of research on the 
authoritarian personality, Masling (18) criti- 
cized the tendency for value judgments to be- 
come confounded with science in this research 
area. According to him all “bad” personality 
characteristics tend to be ascribed to authori- 
tarians without, in many instances, substanti- 
ating evidence. Masling feels that the writing 
on the authoritarian personality has implied 
a positive relation between neuroticism and 
authoritarianism. He refutes such a relation- 
ship by summarizing four empirical studies 
that failed to find evidence of a relation be- 
tween authoritarian ideology and measures of 
neuroticism and maladjustment. 

While some of Masling’s points are well 
taken, the studies he summarizes do not suc- 
ceed in answering the question as to the re- 
lation between authoritarian personality and 
neurotic tendencies. For example, Davids (4) 
has recently reported highly significant cor- 


1 The authors wish to thank Capt. G. J. Duffner, 
Capt. J. L. Kinsey, Lt. H. B. Murphree, and others 
on the staff of the Medical Research Laboratory at 
the New London Submarine Base, for making it pos- 
sible to obtain the data used in this study. We wish 
to emphasize, however, that this experiment was in 
no way sponsored by or officially affiliated with the 
Naval service. The opinions, assertions, or conclu- 
sions contained in this report are those of the authors 
and are not to be construed as reflecting the views 
or indorsement of the Navy Department. This study 
was facilitated by a research grant from the Harvard 
Laboratory of Social Relations and a grant from the 
National Institute of Mental Health, Public Health 
Service. 







relations between F-scale scores and measures 
of maladjustment derived from the Manifest 
Anxiety Scale and clinical evaluations by an 
experienced psychoanalyst. Also Freedman, 
Webster, and Sanford (9) have recently re- 
ported significant correlations between the F 
scale and the hysteria and psychasthenia 
scales from the MMPI. It was the purpose of 
the present study to investigate some of the 
factors that may be contributing to these con- 
flicting results. 

In examining the discrepancies between the 
findings of Davids and of Freedman, Webster, 
and Sanford and those reported by Masling, 
one of the more obvious differences between 
these studies is in the populations from which 
the Ss were drawn. Davids’ sample as well as 
that of Freedman et al. consisted of university 
undergraduates, while three of the four stud- 
ies summarized by Masling employed Ss se- 
lected from nonstudent populations. In one 
case the sample was drawn from patients in 
a psychiatric clinic, while in another the sam- 
ple was selected randomly on the basis of 
census tracts in a large city. In the third 
study the sample consisted of Naval recruits. 
Even in the fourth study the sample was com- 
posed of university summer-school students 
who are not likely to be representative of un- 
dergraduates. These differences in population 
samples suggested to us that the differences 
in findings might well be due to some com- 
plex relation between authoritarianism, neu- 
roticism, and certain sociocultural variables. 




In order to examine this possibility, we have 
repeated Davids’ study, using a sample of 
Naval enlisted men. Contrary to Davids’ find- 
ings with university students, we expected the 
Navy Ss to yield negative results similar to 
those reported by Masling. 

In his study, Davids failed to find a sig- 
nificant relation between authoritarian ideol- 
ogy and intolerance of ambiguous auditory 
stimuli. Since the concepts of “rigidity” and 
“intolerance of ambiguity” play a central role 
in the theory of authoritarian personality (11, 
12), we wished to determine if Davids’ failure 
to find the predicted relation might have been 
due to the sample studied. Accordingly, in the 
present study, we administered the auditory 
projective test (7) to the Naval trainees, and 
again investigated relations between authori-
tarianism and reactions to ambiguity.²


Method 


Subjects. Forty-eight Naval enlisted men 
served as Ss in the present experiment. These 
young men, the majority of whom were of 
college age although very few had attended 
college, were undergoing training at the New 
London Submarine Base. The assessment 
measures used in this study were included in 
the battery of procedures that is administered 
routinely to all trainees upon their arrival at 
the submarine school. 


Authoritarianism. The Ss were administered the 
30-item California F scale (1). Total scores ranged 
from a low of 95 to a high of 172, with a mean of 
134.3. The mean score per item was 4.48. 

Intelligence. Performance on the standard Naval 
General Classification Test (GCT) was used as a 
measure of intelligence. Conventionally, scores on 
this test are converted to T scores, forming a dis- 
tribution with a mean of 50 and standard deviation 
of 10. In the present sample, GCT scores ranged from 
33 to 67, and the mean score was 56. 

Manifest anxiety. The Taylor scale of manifest 
anxiety (22) was administered to the Ss. Scores 
ranged from 2 to 22, with a mean of 8.9. 

Psychosomatic Inventory. This inventory, con-
structed by McFarland and Seitz, is designed to pro-
vide a measure of neuroticism (17). High scores in-
dicate normality and low scores indicate neuroticism.
Scores for the present Ss ranged from 91 to 383,
with a mean of 292.7.


2 Since the F scale (authoritarianism) and the E
scale (ethnocentrism) have been shown to correlate
.77 (1), they are frequently used interchangeably
and results are often generalized from one scale to
the other. Also, the concepts of “rigidity” and “in-
tolerance of ambiguity” have been used interchange-
ably (11, 12), and both have been applied to high
authoritarians. In the present report we will not at-
tempt to differentiate between these concepts.

Reactions to ambiguous auditory stimuli. The Az- 
zageddi Test was administered; this test is an audi- 
tory projective technique consisting of passages of 
spoken communication containing contradictory and 
irreconcilable statements and ideas (4, 5, 7). Al- 
though each statement, by itself, is meaningful and 
coherent, when several statements are intermingled 
in a passage of speech, there is much confusion and 
contradiction inherent in the passage. Consequently, 
the S is confronted with confusing and contradic-
tory ideas and asked to recall as many of the ideas
as he can from the passage. The total number of 
phrases and statements in the test is 112. The num- 
ber of items recalled by the present Ss ranged from 
a low of 12 to a high of 60, with a mean recall score 
of 33. After hearing, and recalling, the eight pas- 
sages which constitute the test, the Ss were pre- 
sented with sheets on which they could indicate 
their personal reactions to the auditory projective 
test. On 6-point rating scales, they indicated the 
degree of ambiguity they perceived in the spoken 
material, and the degree of satisfaction (liking) or 
dissatisfaction (disliking) they experienced while at- 
tempting to cope with the task. 
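
Two of the descriptive figures given under Method can be checked with simple arithmetic. The sketch below confirms the mean score per item on the 30-item F scale and expresses the sample's mean GCT score in standard deviation units, using the T-score convention (mean 50, SD 10) stated above. The variable and function names are illustrative and are not taken from the source.

```python
# Mean F-scale score per item: mean total score divided by number of items.
F_SCALE_ITEMS = 30
f_scale_mean_total = 134.3
mean_per_item = f_scale_mean_total / F_SCALE_ITEMS

def t_score_to_z(t_score, mean=50.0, sd=10.0):
    """Convert a T score (mean 50, SD 10 by convention) to a z score."""
    return (t_score - mean) / sd

# The sample's mean GCT score of 56, expressed in SD units of the norm group.
gct_sample_mean = 56
gct_z = t_score_to_z(gct_sample_mean)

print(f"Mean per item: {mean_per_item:.2f}")
print(f"GCT sample mean in SD units: {gct_z:+.1f}")
```

On this reckoning the trainees averaged about six-tenths of a standard deviation above the mean of the GCT norm group.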


Results 


Table 1 presents the product-moment cor- 
relations among the various experimental 
measures. Since these findings will be com- 
pared with those obtained by Davids with 
university Ss, we will first review briefly the 
significant findings from that investigation 
(4). It was found that the F scale correlated 
.69 with manifest anxiety, and — .57 with 
scores on the Psychosomatic Inventory. In 
both cases these coefficients, which are sig- 
nificant beyond the .01 level, indicate that 
the college students who were high on au- 
thoritarianism tended to score relatively high 
on measures of neuroticism. Also, students 
who were high on authoritarianism had lower 
grade-point averages (— .40) although there 
was no significant relation between grade- 
point averages and neuroticism. 

It is evident from Table 1 that the pattern 
of intercorrelations obtained with the present 
sample of Naval trainees is quite different. 
Here, as predicted, there is no significant re-
lation between F-scale scores and either meas-
ure of neuroticism. Again, however, there is a
significant negative correlation between the F 
scale and intelligence which, for this sample, 







Table 1

Product-Moment Intercorrelations Among Experimental Measures
(N = 48)

                        Intelligence    Auditory    Manifest    P-S
Measure                 (GCT)           test        anxiety     inventory
F scale                 -.24*           .10         .02         .12
Intelligence (GCT)                      .37**       .08         .12
Auditory test                                       .08         .04
Manifest anxiety                                                .26*

* Significant at the .05 level for a one-tailed test.
** Significant at the .01 level for a one-tailed test.
[The signs of several coefficients are illegible in this copy and are omitted; the negative sign of the F scale and intelligence correlation is confirmed by the text.]


is measured by GCT scores. This finding is in 
keeping with results reported by several previ- 
ous investigators who have found significant 
relations between authoritarianism and vari- 
ous measures of intelligence (1, 3, 4, 5, 13). 
It should also be noted that there is no evi- 
dence of a relation between manifest anxiety 
and intelligence in the present sample. This 
finding agrees with those reported in some 
studies (4, 5, 6, 20, 21), but is at variance 
with certain other results (14, 16, 19). 

As a further test of our predictions we com- 
puted the significance of the difference be- 
tween the correlations of the F scale and the 
neuroticism measures obtained from the uni- 
versity Ss and the Navy Ss. For both meas- 
ures of neuroticism the difference between the 
correlations obtained from the two samples is 
significant beyond the .05 level of confidence. 
Table 2 shows the means and variances ob- 
tained by the Navy Ss and the university Ss 
on the F scale, Manifest Anxiety Scale, and 
Psychosomatic Inventory. Statistical analyses 
of these findings show pronounced differences 


between the two groups. As we would expect, 
the Navy personnel, as a group, are much 
higher on authoritarianism and much lower 
on both measures of maladjustment.³
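
The two kinds of comparison described in this paragraph can be sketched as follows. The source does not name the test used for the difference between correlations; Fisher's r-to-z transformation is assumed here as one common choice. Likewise, the unpooled form of t is an assumption: the source gives no formula, although this form does reproduce the tabled values for the manifest anxiety row of Table 2. The Navy correlation of .02 is entered by magnitude only, since its sign is illegible in the scan, and all function names are illustrative.

```python
import math

def fisher_z_difference(r1, n1, r2, n2):
    """z test for the difference between two independent correlations."""
    z1, z2 = math.atanh(r1), math.atanh(r2)        # Fisher r-to-z transform
    standard_error = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (z1 - z2) / standard_error
    p_two_tailed = math.erfc(abs(z) / math.sqrt(2))  # normal-curve tail area
    return z, p_two_tailed

def variance_ratio(var_a, var_b):
    """F as the larger variance divided by the smaller."""
    return max(var_a, var_b) / min(var_a, var_b)

def unpooled_t(mean_a, var_a, n_a, mean_b, var_b, n_b):
    """t from means and variances without pooling (ASSUMED form)."""
    return (mean_a - mean_b) / math.sqrt(var_a / n_a + var_b / n_b)

# F scale with manifest anxiety: university Ss (r = .69, N = 20)
# versus Navy Ss (r of magnitude .02, sign illegible, N = 48).
z_diff, p_diff = fisher_z_difference(0.69, 20, 0.02, 48)

# Manifest anxiety row of Table 2: university Ss versus Navy Ss.
f_anxiety = variance_ratio(22.6, 64.3)
t_anxiety = unpooled_t(20.8, 64.3, 20, 8.9, 22.6, 48)

print(f"z = {z_diff:.2f}, two-tailed p = {p_diff:.4f}")
print(f"F = {f_anxiety:.2f}, t = {t_anxiety:.2f}")
```

With either sign for the Navy coefficient the difference between correlations remains well beyond the .05 level, which agrees with the statement in the text.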

Let us now examine the relation between
F-scale scores and the Ss’ performance in re-
sponse to the ambiguous auditory stimuli.
Taking as an indirect measure of ambiguity
tolerance the number of ideas recalled by the
Ss, the correlation of .10 shown in Table 1
indicates that scores on the F scale are not
correlated significantly with intolerance of
ambiguity. In order to make a more direct

3 Since the Ss in this study were trainees in Sub-
marine School, they constitute a highly selected
group. Lest it be argued that the selected nature of
our sample may have militated against a relation-
ship between authoritarianism and neuroticism, it
should be noted that while the Naval Ss are lower
on neuroticism scores than the college samples, they
are considerably higher, as a group, on the F scale.
These mean differences between the two samples
on the two dimensions are in themselves evidence against a
universal positive relationship between authoritarian-
ism and neuroticism.


Table 2

Differences Between Naval Enlisted Men and University Students on Measures of
Authoritarianism and Neuroticism

                            Navy Ss               University Ss
                            (N = 48)              (N = 20)
Measure                     Mean     Variance     Mean     Variance     F         t
F scale                     134.3    268.6        89.0     354.6        1.32      9.44**
Manifest anxiety scale      8.9      22.6         20.8     64.3         2.85**    6.20**
P-S inventory               292.7    3583.4       137.7    25730.9      7.18**    4.20**

** Significant beyond the .01 level.







test of the relation between authoritarianism 
and ambiguity tolerance, the Ss were dichoto- 
mized into a high group and a low group on 
the basis of their F-scale scores, and this 
classification was related to personal reactions 
to the auditory test. Chi-square tests of asso- 
ciation, corrected for continuity, showed no 
significant relation between the F scale and 
the Ss’ ratings of the test material as either 
ambiguous or unambiguous (χ² = .87), and
no significant relation between the F scale
and the Ss’ ratings of whether they liked or
disliked the auditory test (χ² = 1.02). Thus,
neither the indirect measure of tolerance of 
ambiguous spoken passages, based on the 
number of ideas recalled, nor the direct meas- 
ures, based on self-ratings, were found to be 
associated with authoritarianism. In this re- 
spect, the present findings with Naval trainees 
duplicate the previous results obtained by 
Davids with university undergraduates. 
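
The chi-square test of association with correction for continuity, as applied above to the dichotomized F-scale groups, can be sketched for a 2 x 2 table as follows. The cell frequencies from the study are not reported, so the counts used here are HYPOTHETICAL and serve only to show the computation; the function name is likewise illustrative.

```python
def yates_chi_square(a, b, c, d):
    """Chi-square with continuity correction for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    # Shrink |ad - bc| by n / 2 (Yates' correction), never below zero.
    corrected = max(abs(a * d - b * c) - n / 2, 0)
    numerator = n * corrected ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

# HYPOTHETICAL counts for 48 Ss, NOT taken from the study:
# rows = high F, low F; columns = rated "ambiguous", rated "unambiguous".
chi_square = yates_chi_square(14, 10, 10, 14)
print(f"chi-square = {chi_square:.2f}")  # prints "chi-square = 0.75"
```

With 1 df a value of 3.84 is required for significance at the .05 level, so values of the size reported in the text (.87 and 1.02) fall far short of it.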


Discussion 


The above results show that Navy men are 
more authoritarian, yet less neurotic, than 
the university students. Even though egali- 
tarian attitudes characterize university un- 
dergraduate cultures, as a group, the students 
are less well adjusted than the Navy enlisted 
men who are high on authoritarianism, but 
do not seem to suffer from undue anxiety or 
neurotic symptoms. That is to say, in the 
university social setting where Ss tend to be 
low on authoritarianism, they tend to be rela- 
tively high on neuroticism and there is a posi- 
tive association between being high in both 
dimensions. In the social setting of a military 
installation, however, and probably in many 
other nonacademic environments, the stand- 
ard of reference on authoritarianism is quite 
high. And in such a setting there seems to be
no relation between authoritarianism and neu- 
roticism (10, 18). 

The above results are consistent with the 
hypothesis that relations between authori- 
tarian ideology and neuroticism are deter- 
mined to a large degree by sociocultural fac- 
tors. There are several reasons for suspecting 
that sociocultural factors might influence this 
relationship. One is to be found in the gen- 
eral attitudes of liberalism and nonprejudice 
that characterize the subcultures of most uni- 





versities. Almost by definition the neurotic is 
less apt to adapt to and assimilate the atti- 
tudes and values of teachers and classmates 
in his social environment. His social percep- 
tions are likely to be less sensitive and real- 
istic. Moreover, aggressive and hostile tenden- 
cies may lead him to actively reject the group 
attitudes and values. Placed in an environ- 
ment where it is socially rewarding to express 
liberal democratic attitudes, the neurotic is 
less apt to assimilate than the so-called nor- 
mal individuals. In a social setting such as a 
Naval training station, however, in which 
there is no premium placed on liberalism 
and egalitarianism, it seems quite likely that 
neuroticism would not be a correlate of au- 
thoritarianism. 

Our failure to find a relationship between 
F-scale scores and several measures of toler- 
ance for ambiguity must take its place along 
with an increasing number of other investiga- 
tions that have failed to confirm or replicate 
previously reported research on authoritarian- 
ism (2, 4, 5, 8, 10, 15). 


Summary 


The main purpose of the study was the at- 
tempt to clarify some of the confusion and 
contradiction that is currently found in the 
research area concerned with authoritarian- 
ism and personal adjustment. The different 
results reported by Masling with nonuniver- 
sity Ss and by Davids with university Ss sug- 
gested that the sociocultural setting in which 
Ss are examined might well have a significant 
influence on relations between the variables 
of authoritarianism and neuroticism. Although 
a group of Naval enlisted men examined in 
the sociocultural setting of a military instal- 
lation were found to be higher on authori- 
tarianism than were a group of university Ss, 
they tend to be significantly lower on meas- 
ures of neuroticism. And with the military 
personnel there was no significant relation 
between authoritarianism and neuroticism, 
whereas with the university Ss, there was a 
significant positive association between these 
variables. 

Moreover, using the Naval enlisted trainees 
as Ss, there was no relation between authori- 
tarianism and intolerance of ambiguous audi- 
tory stimuli. This finding, which is contra- 





























dictory to prediction based on authoritarian 
personality theory, is in keeping with previ- 
ous negative findings with university Ss. It 
was found, however, that authoritarianism 
was significantly correlated with a measure of 
intelligence, a finding that fits well with re- 
sults of previous studies. Also, in the present 
study, manifest anxiety and intelligence were 
not found to be associated. 


Received January 8, 1957. 
Early Publication. 


References 


1. Adorno, T. W., Frenkel-Brunswik, Else, Levin- 
son, D. J., & Sanford, R. N. The authoritarian 
personality. New York: Harper, 1950. 

2. Brown, R. W. A determinant of the relationship
between rigidity and authoritarianism. J. ab- 
norm. soc. Psychol., 1953, 48, 469-476. 

3. Cohn, T. S. Is the F scale indirect? J. abnorm.
soc. Psychol., 1952, 47, 732.

4. Davids, A. Some personality and intellectual cor- 
relates of intolerance of ambiguity. J. abnorm. 
soc. Psychol., 1955, 51, 415-420. 

5. Davids, A. The influence of ego-involvement on 
relations between authoritarianism and intoler- 
ance of ambiguity. J. consult. Psychol., 1956, 
20, 179-184. 

6. Davids, A., & Eriksen, C. W. The relation of 
manifest anxiety to association productivity 
and intellectual attainment. J. consult. Psy- 
chol., 1955, 19, 219-222. 

7. Davids, A., & Murray, H. A. Preliminary ap- 
praisal of an auditory projective technique for 
studying personality and cognition. Amer. J. 
Orthopsychiat., 1955, 24, 543-554. 

8. Eriksen, C. W., & Eisenstein, D. Personality ri- 
gidity and the Rorschach. J. Pers., 1953, 21, 
386-391. 

9. Freedman, M., Webster, H., & Sanford, N. A
study of authoritarianism and psychopathol-
ogy. J. Psychol., 1956, 41, 315-322.


10. French, Elizabeth G. Interrelation among some 
measures of rigidity under stress and non-
stress conditions. J. abnorm. soc. Psychol.,
1955, 51, 114-118. 

11. Frenkel-Brunswik, Else. Intolerance of ambiguity 
as an emotional and perceptual personality 
variable. J. Pers., 1949, 18, 108-143. 

12. Frenkel-Brunswik, Else. Personality theory and 
perception. In R. R. Blake & G. V. Ramsey 
(Eds.), Perception: An approach to person- 
ality. New York: Ronald Press, 1951. Pp. 
356-419. 

13. Gough, H. G. Studies of social intolerance: I.
Some psychological and sociological correlates 
of anti-Semitism. J. soc. Psychol., 1951, 33, 
237-246. 

14. Grice, G. R. Discrimination reaction time as a 
function of anxiety and intelligence. J. ab- 
norm. soc. Psychol., 1955, 50, 71-74. 

15. Jones, E. E. Authoritarianism as a determinant 
of first-impression formation. J. Pers., 1954, 
23, 107-127. 

16. Kerrick, J. S. Some correlates of the Taylor 
Manifest Anxiety Scale. J. abnorm. soc. Psy- 
chol., 1955, 50, 75-77. 

17. McFarland, R. A., & Seitz, C. P. A psycho-
somatic inventory. J. appl. Psychol., 1938, 22,
327-339. 

18. Masling, J. M. How neurotic is the authori- 
tarian? J. abnorm. soc. Psychol., 1954, 49, 
316-318. 

19. Matarazzo, J. D., Ulett, G. A., Guze, S. B., & 
Saslow, G. The relationship between anxiety 
level and several measures of intelligence. J.
consult. Psychol., 1954, 18, 201-205. 

20. Matarazzo, Ruth G., & Matarazzo, J. D. Anxiety 
level and pursuit-meter performance. J. con- 
sult. Psychol., 1956, 20, 70. 

21. Mayzner, M. S., Sersen, E., & Tresselt, M. E.
The Taylor manifest anxiety scale and intelli- 
gence. J. consult. Psychol., 1955, 19, 401-403. 

22. Taylor, Janet A. A personality scale of manifest 
anxiety. J. abnorm. soc. Psychol., 1953, 48, 
285-290. 








Journal of Consulting Psychology 
Vol. 21, No. 2, 1957 





Spoken and Written Vocabulary; Their Relation to a 
Standard Vocabulary Test, Intelligence and Anxiety¹


Maurice W. Sullivan and Allen D. Calvin 
Hollins College 


A study was designed to investigate the re- 
lationship between various measures of vo- 
cabulary, intelligence, and anxiety. 

Method. The Ss were 40 female under- 
graduate students from Hollins College. Otis 
higher form intelligence tests and the Taylor 
A scale were administered in group form. On 
a different day, all Ss were asked to write an 
essay on success which had to be at least 300 
words long. Each S was individually inter- 
viewed in the psycholinguistics laboratory 
where she was asked to discuss five campus 
topics; e.g., should the college eliminate Satur- 
day classes. Conversations were tape recorded 
through a concealed microphone. Twenty Ss 
were interviewed by a professor (MWS), and 
the other twenty by a student. At the con- 
clusion of the interview each S was given the 
Wechsler vocabulary test. 

Analysis. The following scores were ob-
tained for each student: (a) Otis IQ, (b)
Wechsler vocabulary, (c) A scale, (d) num- 
ber of written words, (e) for every fourth 
written word the word frequency was obtained 
from the L list of the Thorndike-Lorge Teach- 
ers Word Book, (f) every fourth written word 
was analyzed, and a count was made of the 
number of words which fell outside of the M 


1 An extended report of this study may be ob-
tained without charge from Maurice W. Sullivan,
Hollins College, Va., or for a fee from the American 
Documentation Institute. Order Document No. 5098, 
remitting $1.75 for microfilm or $2.50 for photo- 
copies. 


group (words in the M group occurred 1,000 
times or more on the Lorge magazine count) 
and the number was divided by the S’s total 
number of written words analyzed, (g) num-
ber of oral words, (h) for every fourth oral
word the word frequency was obtained from
the L list of the Thorndike-Lorge Teachers
Word Book, (i) every fourth word was ana- 
lyzed, a count was made of the number of 
words which fell outside of the M group, and 
the number was divided by the S’s total num- 
ber of oral words analyzed, (j) a ratio of 
verbs to adjectives was obtained using every 
fourth oral word. 
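
The scoring described in (f) and (i) above can be sketched in a few lines. The sketch assumes that sampling begins with the fourth word, which the report does not specify, and it uses a small HYPOTHETICAL set of words in place of the actual M group of the Thorndike-Lorge count, which is not reproduced here. The function name is illustrative.

```python
def non_m_proportion(words, m_group, step=4):
    """Proportion of sampled words that fall outside the M group.

    Every `step`-th word is sampled (the 4th, 8th, ... by default).
    Returns 0.0 when no word is sampled.
    """
    sampled = words[step - 1::step]
    if not sampled:
        return 0.0
    outside = sum(1 for word in sampled if word.lower() not in m_group)
    return outside / len(sampled)

# HYPOTHETICAL stand-in for the M group (the real list is far longer).
hypothetical_m_group = {"the", "over", "and", "again"}

# HYPOTHETICAL text; the sampled words are "fox", "lazy", and "again".
essay_words = "the quick brown fox jumps over the lazy dog again and again".split()

ratio = non_m_proportion(essay_words, hypothetical_m_group)
print(f"Proportion of sampled words outside the M group: {ratio:.2f}")
```

Dividing by the number of words sampled, rather than by the total length of the essay or interview, keeps the score comparable across Ss who produced different amounts of material.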

Results. The oral word frequency, written 
word frequency, and verb to adjective ratio 
all had to be discarded because of lack of 
reliability. Intercorrelations of all reliable 
measures were computed, and the following 
significant correlations were obtained: vo- 
cabulary test and amount of written words 
.37 (p < .05); Otis scores and vocabulary
test .43 (p < .05); amount of non-M oral
words and A-scale scores -.45 (p < .01).

A Holzinger and Harman cluster analysis 
was computed. The variables, amount of 
written words, amount of written non-M 
words, amount of oral words, vocabulary test, 
and Otis scores, formed a cluster with a B 
coefficient of 1.86; reflected A-scale scores 
and amount of non-M oral words formed an- 
other cluster with the B coefficient of 5.29. 
Brief Report. 

Received December 5, 1956. 

























Journal of Consulting Psychology 
Vol. 21, No. 2, 1957 


A Cross-Cultural Comparison of the MMPI 


Ronald Taft 


University of Western Australia¹


The Minnesota Multiphasic Personality In- 
ventory is perhaps the most widely used per- 
sonality inventory in which the ultimate cri- 
terion for the selection of items was purely 
empirical. The validation was carried out 
“blindly” according to whether each item dis- 
criminated between a sample of psychotics of 
specified types and a sample of “normals.” 
The question arises in such cases whether the 
use of such a test can be generalized beyond 
the geographical and temporal limitations of 
the validating population (Minnesotans in the 
middle 1930’s). 

Strictly speaking, this question can only be 
fully answered by replicating the validation 
procedure over a wide sample of subcultures. 
Some light can, however, be thrown indirectly 
on the generality of the test by comparing 
the mean scores obtained on comparable 
populations in different cultures. This present 
paper carries this out using college students 
in Australia and the U.S.A. as the compari- 
son populations. 

Subjects. The U. S. results are based on 
the integrative studies by Goodstein (4) and 
Black (quoted in 4) which summarize the 
results of a number of MMPI investigations 
of male and female undergraduates respec- 
tively. These two integrative studies have also 
been supplemented by a later study by Clark 
(2). Thus the U. S. college results are based 
on samples tested in various regions of the 
U.S.A., mainly in state colleges, and repre-
senting varied academic fields.²


1 The completion of this study was made possible
by a generous grant from the Carnegie Corporation 
of New York. The author’s thanks are given to Ron- 
nie Jennings and Iain Wyatt for hand-scoring the 
tests. 

2 Clark does not indicate whether he used the K 
correction. All of the other studies, including our 
own, used this correction. 




The Australian subjects were first-year psy- 
chology students at the University of West- 
ern Australia (state) in 1953 and 1955. Per- 
sons who had not been educated in Australia 
or Britain were excluded and there were 65 
males and 67 females. The subjects took the 
test as part of their laboratory training; they 
were informed that the test was not compul- 
sory and that the results would be used for 
research purposes by the writer who was one 
of their instructors. The test was not anony- 
mous but there were only two protocols that 
were too incomplete to be used. 

In this type of study it is important to dis- 
cuss the comparability of the samples. The 
Australian sample may differ from at least 
some of the U. S. samples in the following 
ways: 

1. The students had elected psychology as one of 
their courses. 

2. The average age, especially of the males, was 
high owing to the presence of a number of part-time 
students. The males averaged 26.6 years (σ, 8.2) 
and the females 20.8 (σ, 7.3). 

3. As a result of the selective method of entry to 
the University there are few subjects with relatively 
low IQs. The mean IQ on a group test of entering 
freshmen is 124 and only approximately 15 per cent 
fall below 116. 


In the discussion section of this paper we 
shall point out why we consider that these 
biases in the Australian sample have not af- 
fected the findings. 


Results 


The means and standard deviations of the 
male and female samples are presented in 
Tables 1 and 2 respectively. 

The only significant differences between the 
U. S. and Australian means are on Mf; both 
the male and female Australians come out 
more feminine. On the male comparisons, the 








162 Ronald Taft 


Table 1 
Comparative MMPI Results for U. S. and Australian Samples—Males 

                     1                   2                3                  4 
               U. S. Males*           Range of     West Australian    West Australian 
               (N = 5742)             means in         Males            Males aged        t test 
             Median    Median         the ten         (N = 65)         17-24 years        3 vs. 1 
Scale        Mean      SD             studies       Mean     SD       (N = 30) Mean      (t > 1.0) 

?             --        --              --           6.9    13.8           7.7 
L             --        --              --           3.6     2.0           3.5 
F             --        --              --           4.5     3.2           4.9 
K            14.5       4.6          13.8-16.4      15.3     4.4          15.9             1.1 
(Raw score) 
Hs           52.5       8.3          50.5-54.1      53.0     8.1          52.3 
D            53.0      10.5          50.2-53.9      55.0    10.5          57.6 
Hy           55.5       7.8          52.9-52.8      57.4     8.0          57.9             1.9 
Pd           56.0       9.9          54.3-58.1      57.4    11.9          56.1             -- 
Mf           58.0      10.2          54.9-62.7      62.7     9.6          65.0             3.9 
Pa           53.0       8.0          50.0-54.1      53.8     9.8          54.6             -- 
Pt           56.4      10.2          53.6-58.0      54.8     9.3          57.9             1.0 
Sc           56.9      10.5          52.4-57.2      56.4    10.3          59.1             -- 
Ma           58.1      10.1          55.8-60.2      57.5    10.3          57.1 

* The U. S. figures are based on a combination of ten studies; nine which were integrated in (4) and a tenth (2), which recorded 
results derived from 707 college undergraduate males. 


Australian mean is outside the range of the 
ten U. S. means only on the D scale. 

The variability of the Australian subjects 
tends to be higher than the American and the 
SDs are significantly different (0.05 level) 
on Pd and Pa for the males and Pa and Sc 
for the females. From the practical point of 
view it is also important to compare the 


Table 2 

Comparative MMPI Results for U. S. and Australian Samples—Females 

                                                      3                    4 
                  1                 2           West Australian     West Australian 
            U. S. Females*    U. S. Females†       Females            Females aged        t test 
            (N = 5014)        (N = 1817)           (N = 67)           17-21 years         3 vs. 1 
Scale          Mean               SD             Mean     SD         (N = 53) Mean       (t > 1.0) 

?               --                --              4.7    13.3            5.2 
L               --                --              4.4     2.3            4.6 
F               --                --              4.6     3.1            4.6 
K              15.5               --             14.8     4.4           14.7               1.5 
(Raw score) 
Hs             49.0              7.0             50.4     8.0           50.1               1.3 
D              49.0              8.5             51.0     8.2           50.6               1.8 
Hy             53.5              7.75            53.7     8.2           53.2               -- 
Pd             54.0              9.0             54.3     9.8           53.7               -- 
Mf             50.5              9.25            47.3     9.2           47.2               2.9 
Pa             53.5              7.75            52.4    10.0           52.1               -- 
Pt             53.0              8.0             54.0     9.2           54.3               -- 
Sc             55.0              7.5             56.3    10.2           56.5               1.0 
Ma             55.5             10.25            55.2    10.7           54.5               -- 

* The U. S. figures represent the median mean of 15 reported studies which were integrated by Black. The figures were estimated 
to the nearest 0.5 from the graph appearing in (4, p. 440). Further details of the integrative study could not be obtained. 
† The figures represent the mean (weighted) SD from the four studies that were available (2, 3, 5, 6). 






































A Cross-Cultural Comparison of the MMPI 


Table 3 

Comparative Levels of “Abnormal Scores” for U. S. and Australian Samples * 

                          Males                                          Females 
           T 70 Score          Per cent over 70           T 70 Score          Per cent over 70 
                    West                  West                     West                  West 
Scale     U.S.   Australian    U.S.†   Australian        U.S.   Australian    U.S.†   Australian 
Hs         69       69          2.8       1.5             63       66          0.8       0.0 
D          74       76          5.4       9.2             66       77          1.6       4.5 
Hy         71       73          2.5       7.7             69       71          2.9       3.0 
Pd         76       81          8.0      15.4             72       74          4.4       3.0 
Mf         78       82          9.8      26.2             69       66          4.0       1.5 
Pa         69       73          3.0       7.7             69       72          3.8       4.5 
Pt         77       73          9.8       4.6             69       72          3.0       4.5 
Sc         78       77          7.8      10.8             70       77          6.0       9.0 
Ma         78       78          9.8      12.3             76       77          9.5       7.5 

* Based on the means and SDs reported in Tables 1 and 2. 
† A weighted mean combining the results reported in (3) and (5). N = 1158 males, and 473 females. 


scores of the subjects at the “abnormal” extreme; 
i.e., the T 70 or plus 2 SD level. These 
data are presented in Table 3. 
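
As an illustration only, the "plus 2 SD" level can be computed as the sample mean plus two sample standard deviations, which is how the footnote to Table 3 describes its derivation from Tables 1 and 2. The function name `t70_level` is hypothetical and the rounding to whole T-score points is an assumption about how the tabled values were obtained.

```python
def t70_level(mean_t, sd_t, n_sd=2.0):
    """Return the score lying n_sd sample SDs above the sample mean."""
    return mean_t + n_sd * sd_t


# (mean, SD) pairs copied from Table 1 (males).
us_males_hs = (52.5, 8.3)    # U. S. males, Hs scale
wa_males_pd = (57.4, 11.9)   # West Australian males, Pd scale

# Rounding to whole points is an assumption, not stated in the paper.
print(round(t70_level(*us_males_hs)))
print(round(t70_level(*wa_males_pd)))
```

With these inputs the rounded values agree with the corresponding male entries for Hs (U. S.) and Pd (West Australian) in Table 3.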

The level of the T 70 scores differs by 5 or 
more on Pd for the males and D and Sc for 
the females. The Australian males obtain sig- 
nificantly more (0.05 level) abnormal scores 
on Hy, Pd, Mf, and Pa, but the females do 
not differ significantly on any of the scales. 
Fifty per cent of the Australian males obtain 
at least one abnormal score, compared with 33 
per cent of a U. S. sample (3), while the fig- 
ures for the females were 26 per cent and 27 
per cent respectively. 


Discussion 


The means of the American and Australian 
college samples differ only on Mf. At the same 
time it is worth noting that this scale has the 
highest range of all scales over the ten U. S. 
samples, and the Australian mean is within 
that range. It would seem that this scale is the 
one most susceptible to cultural influences, a 
conclusion which is consistent with the mode 
of constructing the scale in the first place, 
i.e., distinguishing men’s and women’s responses 
rather than the responses of some 
groups discriminated on personality criteria 
as in the other scales. 

Comparing the American and Australian 
samples at the “abnormal” level of the scales, 
the Australian males tend to score higher on 
Hy, Pd, Mf, and Pa, while the females are 


higher on D and Sc. If we judge the means 
and the T 70 scores from a practical point of 
view, we might call scores “equivalent” when 
the differences are less than 5 points (half a 
standard deviation). Excluding also the Mf 
scales which differ significantly, we find that 
we can call seven of the male scales equiva- 
lent (not Mf or Pd) and six of the female 
scales (not Mf, D, or Sc). 
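
The "equivalence" rule just described can be sketched for the male scales as follows. The figures are copied from Tables 1 and 3; the helper name `non_equivalent` and the reading of "less than 5 points" as a strict inequality are assumptions made for this illustration.

```python
# (U. S., West Australian) male means from Table 1.
MALE_MEANS = {
    "Hs": (52.5, 53.0), "D": (53.0, 55.0), "Hy": (55.5, 57.4),
    "Pd": (56.0, 57.4), "Mf": (58.0, 62.7), "Pa": (53.0, 53.8),
    "Pt": (56.4, 54.8), "Sc": (56.9, 56.4), "Ma": (58.1, 57.5),
}

# (U. S., West Australian) male T 70 levels from Table 3.
MALE_T70 = {
    "Hs": (69, 69), "D": (74, 76), "Hy": (71, 73),
    "Pd": (76, 81), "Mf": (78, 82), "Pa": (69, 73),
    "Pt": (77, 73), "Sc": (78, 77), "Ma": (78, 78),
}


def non_equivalent(pairs, threshold=5.0):
    """Scales whose U. S. and Australian values differ by threshold or more."""
    return sorted(s for s, (us, wa) in pairs.items() if abs(us - wa) >= threshold)


# Mf is excluded separately because its means differ significantly.
excluded = set(non_equivalent(MALE_MEANS)) | set(non_equivalent(MALE_T70)) | {"Mf"}
equivalent = [s for s in MALE_MEANS if s not in excluded]
print(sorted(excluded), len(equivalent))
```

Under these assumptions the male scales set aside are Mf and Pd, leaving seven scales, as stated in the text.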

By and large, then, the scales “hold up” in 
the Australian cultural setting. Let us now 
consider the implications of the differences 
and resemblances which we have found. There 
are three main factors which may possibly 
lead to differences: 

1. Differences in the relationship between 
the college samples and the general American 
or Australian populations; 

2. Variations in the psychological signifi- 
cance of the items according to the cultural 
background of the subjects; and 

3. Personality differences between Austral- 
ians and Americans. 

In the introduction, some possible biases in 
the Australian sample were suggested, but 
there is some strong circumstantial evidence 
that these have not caused any of the differ- 
ences found. The Australian subjects were all 
taking a course in psychology, but Sopchak’s 
results (5) based on a similar sample do not 
show any of the variations from the other 
U. S. studies that were shown by the Aus- 
tralian subjects. The IQs of the Australian 




subjects may also have been higher than some 
of the U. S. samples, but the differences, re- 
ported by Applezweig (1), between groups 
split on IQ at 115 show that the selectivity 
of the sample on intelligence could not ac- 
count for the differences on the MMPI. On 
the contrary, the subjects with the higher 
IQs had fewer abnormal scores on the scales 
on which the Australians had the greater 
number of such scores. The Australians were 
also possibly older than comparable U. S. 
groups, but the results presented in Tables 1 
and 2 for the younger subjects show that the 
differences found could not be attributed to 
this bias. 

We must look, therefore, to explanations 2 
and 3 for the differences that were found; 
these are the psychological “pull” of the items 
and personality differences between the two 
groups of subjects. Whether only one of these 
influences or both were operating, we cannot 
say on the data available. 

In the instances where no difference is 
found between the Australian and American 
results, it is possible that two or three of the 
possible causes of differences were operating 
and counterbalancing each other. It seems 
more parsimonious, however, to assume that 
none of them was operating. Therefore, cul- 
tural differences do not appear to have af- 
fected the psychological significance of the 
equivalent items; namely, Hs, D, Hy, Pa, Pt, 
Sc, and Ma for males, and Hs, Hy, Pd, Pa, 
Pt, and Ma for females. On the other scales 
where differences were found we cannot de- 
cide whether cultural differences were respon- 
sible without a careful cross-validation study 
over the Australian general population, repli- 
cating the original Minnesota study. 

In conclusion it should be remarked that 
the differences in the cultural setting of 
American and Australian university students 
are not radical, and the results of this study 
should not be taken as a justification for ex- 
tending uncritically the MMPI to vastly different 
cultures.3 


3A report has just come to hand (N. D. Sund- 
berg: The use of the MMPI for cross-cultural per- 


Summary 


One test of whether an empirically vali- 
dated inventory is culture bound is to com- 
pare the results of the application of the in- 
ventory to two comparable populations in 
differing cultures. The MMPI was given to a 
sample of students at the University of West- 
ern Australia and the results were compared 
with the means and “abnormal” score levels 
of a number of American college samples. The 
Australian subjects scored higher than the 
Americans on Mf (male and female), Pd 
(males), D (female), and Sc (female). The 
scores on the other seven male and six female 
scales were equivalent and it is inferred that 
the MMPI items on these scales are not cul- 
ture bound, at least within the culture varia- 
tion studied. Where differences were found 
it is impossible to decide on the evidence 
whether they were determined by true per- 
sonality differences between the two groups 
of subjects, or by differences in the psycho- 
logical significance of the items from one cul- 
ture to the other. 


Received July 23, 1956. 


References 


1. Applezweig, M. H. Educational levels and Minnesota 
Multiphasic profiles. J. clin. Psychol., 
1953, 9, 340-341. 

2. Clark, J. H. The interpretation of the MMPI 
profiles of college students. J. soc. Psychol., 
1954, 40, 319-321. 

3. Dobson, W. R., & Stone, D. R. College freshman 
responses on the Minnesota Multiphasic Per- 
sonality Inventory. J. educ. Research, 1951, 
44, 611-618. 

4. Goodstein, L. D. Regional differences in MMPI 
responses among male college students. J. con- 
sult. Psychol., 1954, 18, 436-441. 

5. Sopchak, A. L. College student norms for the 
Minnesota Multiphasic Personality Inventory. 
J. consult. Psychol., 1952, 16, 445-448. 

6. Tyler, F. T., & Michaelis, J. V. A comparison of 
manual and college norms for the MMPI. J. 
appl. Psychol., 1953, 37, 273-275. 


sonality study, J. abnorm. soc. Psychol., 1956, 52, 
281-283) which indicates that German students obtain 
quite different norms on a German translation 
of the MMPI. 





































Journal of Consulting Psychology 
Vol. 21, No. 2, 1957 


“Response to the Human Face as a Standard Stimulus”: 
A Re-examination 


Ernest G. Beier 
University of Utah 

Carroll E. Izard 
General Electric Company 

Charles D. Smock 
Iowa Child Welfare Research Station 

and 

Roland R. Tougas 
University of Akron 


The April 1953 issue of this Journal carried 
an article by the present authors entitled 
“Response to the Human Face as a Standard 
Stimulus” (1). The paper examined several 
hypotheses relating to subjects’ responses to 
pictures of people. Paul F. Secord (personal 
communication) pointed out that the findings 
were indeterminate due to the manner 
in which χ² was applied to the data. Consequently, 
it was decided to retest the hypotheses 
of the earlier paper with an experimental 
design yielding data that could be subjected 
to a definitive analysis.1 The hypotheses are 
stated in the following questions: (a) Do male 
and female subjects give a different proportion 
of “like” responses to the photographs? 
(b) Do the male and female subjects respond 
differentially to the two sexes represented by 
the pictures? (c) Do the male and female 
subjects respond differentially to generations 
or age groups represented by the pictures? 
This paper presents the new experiment, compares 
the findings with those of the previous 
study, and discusses the results and conclusions 
of the present work. 


Method 


In this study the stimuli were pictures of 
human faces representing both sexes and different 
generations or age groups. Subjects 
were male and female college students. The 
response consisted of the subject indicating 
whether he “liked” or “disliked” the individual 
in the picture presented to him. 


1 The authors wish to express their appreciation to 
Dr. C. I. Bliss, Connecticut Agricultural Experiment 
Station and Yale University, for his very valuable 
assistance in the design and analysis of the present 
experiment. 


Subjects. The sample, 60 males and 60 females, 
was drawn from the ROTC unit and 
the College of School of Nursing at a mid- 
western university. 

Selection of pictures. Over one thousand 
pictures were obtained on the main street of 
Syracuse, New York. Brief information ob- 
tained about the people who volunteered to 
contribute their pictures indicated that they 
were from all walks of life. From this pool of 
photographs about 250 had been preselected 
according to the following criteria: (a) full 
face view; (b) nonemotional expressions; 
(c) equal number of males and females; (d) 
age variation. For this study a series of 60 
experimental pictures were chosen at random 
from this group of photographs with the re- 
striction that both sexes and three age groups 
(roughly 2-5; 18-25; and 45-50) were 
equally represented. 

Administration. The pictures were pre- 
sented one at a time to each subject. The in- 
structions were as follows: 




We have here a series of 60 portraits of people’s 
faces. Obviously, when we are confronted with a 
stranger, looking at him and at his face particularly, 
we get certain first impressions about the person. In 
this research, we are seeking for your first impres- 
sion. We will show you the different pictures one at 
a time; some of them will impress you as very like- 
able people whose company you would like. Others 
may impress you as people you wouldn’t care to 
meet or associate with at all. What we want you to 
do is to tell us as we show you each picture, whether 
you feel the person is one you “like” or one that 
you “dislike.” Sometimes you may feel uncertain 
about whether you like or dislike the picture of the 


166 E. G. Beier, C. E. Izard, C. D. Smock, and R. R. Tougas 


person, but we want you to decide for every picture. 
It is very important to remember, however, that we 
are interested in your first impressions; so I’ll not 
let you look at the picture very long—only about 
three or four seconds. When I turn each card over, 
all you need to do is to indicate whether your first 
impression is one of “like” or “dislike.” Do you have 
any questions before we begin? 

It should be noted that the above procedures 
contain two changes from the original study: 
individual instead of group administration 
and instructions limiting responses to two 
categories (“like,” “dislike”) instead of four 
(“very likeable,” “mildly likeable,” “mildly 
disliked,” “definitely disliked”). These changes 
were required by the experimental design. 

Experimental design and procedure. The 60 
experimental pictures represented an equal 
number of males and females. There were 20 
pictures (10 male and 10 female) for each of 
the generations—young, peer, and old. The 
age ranges for the three groups were roughly 
estimated at 2 to 5, 18 to 24, 45 to 50. The 
60 pictures were divided at random into five 
sets of 12 with the restriction that the two 
sexes and three generations be represented 
equally in each set. 

A latin square design was utilized since it 
was expected that order of presentation might 
be a complication. The design was essentially 
a random 5 X 5 latin square in which each 
cell was a random 12 X 12 latin square. The 
12 pictures in each set were assigned at ran- 
dom to the letters A through L, which in turn 
formed the “treatments” or individual units in 
a 12 X 12 latin square. Each set was as- 
signed to the letters a through e of the 5 x 5 
latin square. Actually, five random 12 X 12 
latin squares were made up for each set, and 
the resulting 25 12 X 12 latin squares were 
assigned at random to the cells of the 5 x 5 
latin square. This made a single 60 x 60 
latin square, the 60 columns representing sub- 
jects, the 60 rows order of presentation of the 
pictures, and the 60 “letters” the complete 
series of pictures described above. Sixty male 
subjects and then 60 females were assigned 
at random to the columns, and since the same 
pattern was repeated for males and females, 
the 60 pairs of subjects received the pictures 
in a different random order. 
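
The nested layout described above can be sketched in a few lines. This is an illustrative construction under stated assumptions, not the authors' procedure: the helper names are hypothetical, each small square is produced by shuffling the rows, columns, and labels of a cyclic square (which does not sample uniformly from all latin squares), and the restriction that each set of 12 be balanced on sex and generation is omitted.

```python
import random


def random_latin_square(symbols, rng):
    """Shuffle rows, columns and labels of a cyclic square (not uniform sampling)."""
    n = len(symbols)
    rows, cols, labels = list(range(n)), list(range(n)), list(symbols)
    rng.shuffle(rows)
    rng.shuffle(cols)
    rng.shuffle(labels)
    return [[labels[(r + c) % n] for c in cols] for r in rows]


def nested_latin_square(n_sets=5, set_size=12, seed=0):
    """Build a 60 x 60 square: an outer 5 x 5 square of sets, each cell a 12 x 12 square."""
    rng = random.Random(seed)
    pictures = list(range(n_sets * set_size))
    rng.shuffle(pictures)  # random division into sets (balance restriction omitted)
    sets = [pictures[k * set_size:(k + 1) * set_size] for k in range(n_sets)]
    outer = random_latin_square(list(range(n_sets)), rng)
    size = n_sets * set_size
    square = [[None] * size for _ in range(size)]
    for bi in range(n_sets):
        for bj in range(n_sets):
            inner = random_latin_square(sets[outer[bi][bj]], rng)
            for r in range(set_size):
                for c in range(set_size):
                    square[bi * set_size + r][bj * set_size + c] = inner[r][c]
    return square


def is_latin(square):
    """True when every row and every column holds each picture exactly once."""
    full = set(range(len(square)))
    return all(set(row) == full for row in square) and all(
        set(col) == full for col in zip(*square)
    )


design = nested_latin_square()
print(len(design), is_latin(design))
```

Rows stand for order of presentation and columns for subjects, so reading down a column gives the order in which one subject would see the 60 pictures.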

The “dislike” and “like” responses of each 
subject to the pictures as presented were 


scored as 0 and 1 respectively. Since each 
entry in the two identical 60 x 60 latin 
squares was either 0 or 1, the usual analysis 
of measurements could not be applied. In- 
stead, three independent analyses based upon 
the column totals, row totals, and picture to- 
tals were computed for male and female sub- 
jects separately. For the analysis of variance, 
each total was converted to a percentage of 
60 and then transformed to angles by the in- 
verse sine transformation (2). 
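
A minimal sketch of the angular transformation, assuming the usual form in which a proportion p is replaced by the angle whose sine is the square root of p, expressed in degrees. The function name `angle_from_count` is hypothetical.

```python
import math


def angle_from_count(likes, n=60):
    """Convert a count of "like" responses out of n into an angle in degrees."""
    proportion = likes / n
    return math.degrees(math.asin(math.sqrt(proportion)))


# A total of 30 "like" responses out of 60 corresponds to 45 degrees.
print(angle_from_count(30))
```

The transformation stretches the scale near 0 and 100 per cent, so that the sampling variance of the angle depends only on the number of responses and not on the proportion itself.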


Analysis and Results 


The analysis of variance based on column 
totals and presented in Table 1 tested the sig- 
nificance of the difference in response of male 
and female subjects to the photographs. The 
interaction or error variance was larger than 
that for pairs and that for male vs. female. 
Thus, there was no difference in average response 
among pairs or between sexes of subjects. 
However, the combined mean square 
with 119 degrees of freedom showed a variation 
between subjects from 5 to 6 times larger 
than would be expected by random sampling 
from a single binomial population (σ² 
= 820.7/60 = 13.68). 
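
The constant 820.7 is the theoretical variance factor for angles measured in degrees, (180/π)²/4, and dividing it by the number of responses gives the expected binomial variance. The sketch below reproduces the ratios of observed to expected variance quoted in the tables; the function names are hypothetical, and rounding the expected variance to 13.68 before dividing is an assumption made to match the tabled figures.

```python
import math

# Theoretical variance factor for arcsine-transformed proportions, in degrees.
ANGULAR_CONSTANT = (180.0 / math.pi) ** 2 / 4.0  # approximately 820.7


def binomial_angle_variance(n):
    """Expected variance of an angle based on n binomial responses."""
    return ANGULAR_CONSTANT / n


def variance_ratio(mean_square, n=60):
    """Observed mean square divided by the rounded binomial expectation."""
    return mean_square / round(binomial_angle_variance(n), 2)


print(round(binomial_angle_variance(60), 2))
print(round(variance_ratio(77.82), 2))  # combined mean square, Table 1
print(round(variance_ratio(9.21), 2))   # error mean square, Table 2
```

A ratio well above 1 indicates more variation than binomial sampling alone would produce, and a ratio well below 1 indicates less.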

The analysis of variance based on row to- 
tals tested for trends in the proportion of 
“like” responses as a function of the order of 
presentation. Separate trends were computed 
with orthogonal polynomials for a parabola 
from the row totals averaging the male and 
female subjects and from the corresponding 
differences between male and female sub- 
jects. Since residual variances from the sepa- 


Table 1 

Analysis of Variance Based on Column Totals (in 
Angles) Testing Significance of Difference 
Between Pairs and Sexes of Subjects 

                                        Mean 
Source                          df     square       F 
Pairs of subjects               59      69.70      .80 
Male vs. female subjects         1      37.30      .43 
Sex X pairs                     59      86.63     1.00 
Total                          119      77.82               5.69* 
Binomial variance                ∞      13.68               1.00 

* Significant at .05 level. 




































Table 2 

Analysis of Variance Based on Row Totals (in Angles) 
Testing Significance of Any Trend in Proportion 
of “Like” Responses upon Position of 
Picture in Order of Presentation 

                                        Mean 
Source                          df     square       F 
M-F (Male-Female)                1      43.44 
Order of presentation 
  Linear (M+F)                   1      69.32     7.53** 
  Quadratic (M+F)                1       9.00      .98 
  Linear (M-F)                   1       4.78      .52 
  Quadratic (M-F)                1       1.67      .18 
Error                          114       9.21     1.00      .67** 
Binomial variance                ∞      13.68               1.00 

* Significant at .05 level. 
** Significant at .01 level. 


rate analyses of the totals and the differences 
were homogeneous, they were pooled in a 
single error term and the combined results 
are presented in Table 2. The average linear 
effect was significant (F = 7.53, p < .01), 
indicating the subjects tended to give more 
“like” responses as they progressed through 
the series of 60 pictures. The tendency to 
give more positive responses as the experi- 
ment progressed, however, was not signifi- 
cantly different for males and females, as can 
be seen from the linear term for males minus 
females (M—F; F=.52). Both of the 
quadratic terms, measuring simple curvature 
in the trend, were smaller than the error 
variance. The error term in Table 2 repre- 
sents within-subjects variation and is not ap- 
propriate for testing the difference in the 
average response of the male and female sub- 


jects (see Table 1). However, the error within 
subjects was significantly smaller (p = .001) 
than that expected for random binomial varia- 
tion, apparently reflecting the individual con- 
sistency of response. 
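
The linear and quadratic trend terms can be illustrated with orthogonal polynomial contrasts over equally spaced positions. The sketch below is generic and uses hypothetical values rather than the study's row totals; the function names are assumptions, and each sum of squares carries one degree of freedom.

```python
def orthogonal_contrasts(k):
    """Linear and quadratic contrast coefficients for k equally spaced positions."""
    mean_pos = (k + 1) / 2
    linear = [x - mean_pos for x in range(1, k + 1)]
    mean_sq = sum(c * c for c in linear) / k
    quadratic = [c * c - mean_sq for c in linear]
    return linear, quadratic


def contrast_ss(values, coeffs):
    """Single-degree-of-freedom sum of squares for one contrast."""
    numerator = sum(c * v for c, v in zip(coeffs, values)) ** 2
    return numerator / sum(c * c for c in coeffs)


linear, quadratic = orthogonal_contrasts(60)

# Hypothetical series rising steadily with position: all trend is linear.
series = [40.0 + 0.1 * position for position in range(1, 61)]
print(contrast_ss(series, linear), contrast_ss(series, quadratic))
```

Dividing each sum of squares by the error mean square would give F ratios of the kind listed in Table 2.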

The third and principal orthogonal factor 
in the 60 X 60 latin square of initial 0 and 1 
scores was that represented by the letters or 
picture stimuli. There were an equal number 
of pictures in the six categories in Table 3, 
and the 10 pictures in each category gave the 
“like” response totals in the first two columns 
of the table. Each picture occurred once but 
only once in each position of presentation 
from 1 to 60 and before each of the 60 male 
subjects, and similarly before each of the fe- 
male subjects. From the angles for the 60 ex- 
posures of each picture to subjects of the 
same sex, totaled by categories in the next 
two columns of Table 3, analyses of variance 
showed no differences between the five sets of 
12 pictures used in setting up the 60 x 60 
latin square, either on the average or in their 
interactions with picture category. Accord- 
ingly, the variation in angular scores among 
the 10 replicate pictures in each category 
formed the error terms in Tables 4 and 5. 
The average effect of picture category summed 
over both sexes of subjects was computed 
from the totals in the next to last column of 
Table 3, and the differential response of male 
and female subjects from the differences in 
the last column of Table 3. 

The analysis of variance for picture totals, 
the sum of “like” responses for males to each 
picture plus corresponding sum for females 
with both transformed to angles, is presented 


Table 3 

                      Total “like” responses 
                          in 600 by                      Total angles 
Picture Category       Males     Females      Males    Females     M+F       M-F 
Male      young         443        500        609.6     675.5     1285.1    -65.9 
          peer          407        394        559.2     544.4     1103.6     14.8 
          old           424        386        574.8     538.0     1112.8     36.8 
Female    young         431        464        603.2     661.9     1265.1    -58.7 
          peer          465        478        621.5     638.8     1260.3    -17.3 
          old           347        366        505.1     527.6     1032.7    -22.5 













Table 4 

Factorial Analysis Based on Picture Totals (in Angles) Summed Over Both Sexes of Subjects 
Testing the Significance of the Average Effect of Picture Category 

                                                   Mean 
Source                                     df     square        F 
Young and peer vs. older generation         1    1617.72        * 
Young vs. peer generation                   1     433.85 
Sex of picture                              1      26.70 
Sex X young and peer vs. older              1     367.29 
Sex X young vs. peer                        1     390.29 
Error                                             310.59               22.70** 
Binomial variance                                  13.68                1.00 

* Significant at .05 level. 
** Significant at .01 level. 


in Table 4. By factorial analysis, a separate 
estimate of variance was computed for each 
of the five degrees of freedom associated with 
picture effects (two sexes and three genera- 
tions of pictures). As indicated by the first 
term in Table 4, the subjects preferred the 
pictures of the young and peer groups to those 
of the older generation (p < .05). A study of 
generation totals showed that the greatest 
contrast was young vs. old. Since no other 
picture effect approached significance, these 
preferences were independent of the sex of 
the person pictured. That a very marked in- 
dividuality attached to each picture in the 
responses of all subjects, however, is attested 
by an observed error variance far in excess of 
that expected. 

A similar analysis for differences in pic- 
ture totals (sum of “like” responses for male 

subjects minus the corresponding sum for fe- 
male subjects) is presented in Table 5. The 
first three terms in Table 5 were significant— 
the average difference in response between 
male and female subjects and their differen- 
tial response to the generations represented 
by the pictures. The direction of this differ- 
ence in response is evident from the totals 
for young, peer, and older groups in Table 3. 
Although male and female subjects agreed in 
their preference for their own generation, the 
female subjects preferred pictures of the 
young generation and male subjects pictures 
of the older generation, the sex of the picture 
having no significant effect. Male subjects’ 
totals for the young and peer picture groups 
showed very little difference, while female 
subjects’ totals for the three age groups were 
fairly widely separated. The finding of a sig- 


Table 5 

Factorial Analysis Based on Differences Between Male and Female Subjects’ Picture Totals 
(in Angles) Testing the Significance of the Differential Response of 
Male and Female Subjects to Picture Categories 

                                                   Mean 
Source                                     df     square        F 
Sex of subject                                    106.03      5.50* 
Young and peer vs. older generation               101.01      5.24* 
Young vs. peer generation                         186.36      9.67** 
Sex of picture                                     59.08      3.06 
Sex X young and peer vs. older                     36.58      1.90 
Sex X young vs. peer                               19.31      1.00 
Error                                              19.28      1.00      1.41 
Binomial variance                                  13.68                1.00 

* Significant at .05 level. 
** Significant at .01 level. 













nificantly larger number of positive responses 
among female than among male subjects 
seems to contradict the nonsignificance of the 
difference in Table 1. The comparison in 
Table 5, however, has the difference in re- 
sponse of male and female subjects to the 
individual picture as the unit and in consequence 
has a smaller, more critical error 
variance than that in Table 1 where the in- 
dividual subject was the unit. 

Of especial interest is the much smaller 
error variance for the differences between 
male and female subjects in Table 5 than for 
their sum in Table 4, the latter being 16 
times as large as the former. In fact, the 
variance of the differences is the only one 
which does not differ significantly from bi- 
nomial expectation. This emphasizes again the 
individuality of the pictures used as stimuli. 
Randomly paired male and female subjects 
agreed within the binomial error, after ex- 
cluding differential sex responses to genera- 
tions of pictures, as to their liking for a given 
picture. Factors associated with each picture 
but not isolated in the present study were 
evidently of critical importance in the ob- 
tained response. 

Although the average difference in response 
among age-sex categories was not significant, 
some differences between male and female 
subjects were observed in each of the six pic- 
ture categories. The largest differences be- 
tween male and female subjects were in terms 
of the greater preference of the female sub- 
jects for the young female and young male 
categories. The next largest difference be- 
tween male and female subjects came from 
the male subjects’ greater preference for the 
older male category. Male subjects gave more 
“like” responses to peer males than peer fe- 
males, and female subjects preferred peer fe- 
males to peer males, but both these differ- 
ences were small. 


Discussion 


The three principal findings of the previ- 
ous study were: (a) male and female subjects 
gave a different like-dislike ratio for all age-sex 
categories combined; (b) the subjects as 
a group responded differently to the age-sex 
categories represented by the pictures; (c) 
male and female subjects responded with dif- 


ferent like-dislike ratios to four out of seven 
age-sex categories. Further, the previous study 
suggested that age was a more important de- 
terminant of differential preferences than sex. 

In the present study it was found that: (a) 
there was an average difference between male 
and female subjects in their responses to individual 
pictures; (b) the subjects as a group 
responded more favorably to pictures repre- 
senting young and peer generations than to 
those of older adults; (c) male subjects re- 
sponded differently from female subjects to 
the age groups. 

In comparing the two studies, it should be 
noted that in the present study the difference 
between the total numbers of “like” responses 
given by male and female subjects was sig- 
nificant only when compared picture by pic- 
ture. Further, the first study indicated that 
age and sex of picture were important deter- 
minants of preference ratings while the sec- 
ond study showed age to be significant and 
sex not. However, the two studies were con- 
sistent in showing that peer females were pre- 
ferred to peer males and older males to older 
females. In both studies females gave the 
larger number of “like” responses, and the 
order of preference for age groups was the 
same. 

On the average, subjects gave more “like” 
responses as the experiment proceeded. Since 
each pair of subjects received the pictures in 
a different random order, this observed linear 
effect was independent of any particular pic- 
ture or group of pictures. This position effect 
seems to be a hitherto unconsidered bias in 
studies of the present type, and one which 
could affect the responses to the Szondi, TAT 
and similar projective techniques. 

The decrease in number of “like” responses 
as age of the picture group increased sounds 
another note of caution in the use of human 
photographs as stimuli. Interpretations of re- 
sponses to “mother figures” and “father fig- 
ures” might well refer to the cultural pattern 
as well as to individual dynamics. Since male 
subjects responded differently from females to 
the generations represented by the pictures, 
this finding supports the suggestion made in 
the previous study that it might be well to 
consider establishing separate norms for each 
sex. 










Further study of this different reaction of 
males and females might throw some light on 
the development of cultural-sexual roles, such 
as the point in the life span when this differ- 
ential response begins. 

The comparison of the observed error vari- 
ances with the variance expected from ran- 
dom sampling from a single binomial popula- 
tion yields some information on the usefulness 
of photographs of human faces as stimuli. 
From Table 1, the ratio of observed to ex- 
pected variance indicates significant variabil- 
ity between subjects in their response to the 
pictures, while the comparable ratio from 
Table 2 shows a greater than chance consist- 
ency of response within subjects. The ob- 
served-expected variance ratio from Table 4 
demonstrates the marked individuality at- 
tached to each picture in the responses of all 
subjects. As shown in Table 5, the error vari- 
ance for the differences between male and fe- 
male subjects is much smaller than that for 
their sum and is the only one which does not 
differ significantly from binomial expectation. 
After excluding differential sex responses to 
generations of pictures, randomly paired male 
and female subjects agreed within the bino- 
mial error as to their liking for a given 
picture. The variability due to pictures is ap- 
parently of more importance than that asso- 
ciated with subjects. This further emphasizes 
the individuality of the photographs of hu- 
man faces used as stimuli and indicates that 
factors associated with each picture but not 
identified in this study were of critical im- 
portance in the subjects’ responses. Similar 
stimulus characteristics for photographs of 
human faces were discussed by Izard (3) in 
a study which utilized preference ratings and 
verbalized projective responses. 
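
The comparison of observed error variances with binomial expectation can be illustrated with a minimal sketch. It assumes the standard angular (arcsine) transformation, angle = arcsin(sqrt(p)) expressed in degrees, for which the sampling variance of a proportion based on n binary responses is approximately 821/n whatever the value of p. The function names are illustrative and do not come from the study.

```python
import math


def to_angle(percent: float) -> float:
    """Angular transformation of a percentage: arcsin(sqrt(p)) in degrees."""
    p = percent / 100.0
    return math.degrees(math.asin(math.sqrt(p)))


def expected_binomial_variance(n: int) -> float:
    """Approximate variance (degrees squared) of a transformed proportion
    based on n binary responses: (180/pi)^2 / (4n), about 820.7 / n."""
    return (180.0 / math.pi) ** 2 / (4.0 * n)


def variance_ratio(observed_variance: float, n: int) -> float:
    """Ratio of an observed error variance to the binomial expectation.
    Values well above 1 suggest variability beyond random sampling."""
    return observed_variance / expected_binomial_variance(n)
```

A ratio near 1 corresponds to agreement "within the binomial error"; the observed variances themselves would have to come from the analysis of variance tables, which are not reproduced here.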


Summary 


Sixty photographs of human faces repre- 
senting both sexes and three generations were 
individually administered to 60 college males 
and 60 college females. Each pair of randomly
assigned male and female subjects received
the pictures in different random order.
The subjects gave a “like” or “dislike” re- 
sponse to each of the 60 pictures. The “dis- 
like” and “like” responses were scored 0 and 
1. For male and female subjects separately, 
row totals, column totals, and picture totals 
were converted to percentages and then trans- 
formed to angles. The results of the analyses 
of variance indicated that: 


1. The pairs of subjects did not differ sig- 
nificantly. 

2. A significant difference in the average 
responses of male and female subjects could 
only be detected when compared picture by 
picture. 

3. Both male and female subjects tended to 
give more “like” responses as the experiment 
progressed. 

4. There was no difference in response of 
male and female subjects to the two sexes 
represented by the pictures. 

5. The subjects responded differently to the 
generations represented by the pictures. 

6. Male and female subjects differed in 
their response to the three generations repre- 
sented by the pictures. 

7. A comparison of error variances indi- 
cated a strongly specific response to the in- 
dividual pictures selected at random within 
each age-sex category. 


Received July 16, 1956. 


References 


1. Beier, E. G., Izard, C. E., Smock, C. D., & 
Tougas, R. R. Response to the human face as 
a standard stimulus. J. consult. Psychol., 1953, 
17, 126-131. 

2. Bliss, C. I., & Calhoun, D. W. An outline of
biometry. New Haven: Yale Co-operative, 
1954. 

3. Izard, C. E. Perceptual responses of paranoid 
schizophrenic and normal subjects to photo- 
graphs of human faces. Unpublished doctor’s 
dissertation, Syracuse Univer., 1952. Amer. 
Psychologist, 1953, 8, 372-373. (Abstract) 


























Journal of Consulting Psychology 
Vol. 21, No. 2, 1957 








Adjustment Testing and Personality Factors 
of the Blind 


Sidney I. Dean 


University of Portland 


There are an estimated 300,000 blind per- 
sons in the United States. Much of the in- 
formation about these people comes from 
anecdotal reports and conjectures. Very little 
can be stated with confidence about the at- 
tributes of the blind, and even less about the 
interplay between blindness and adjustment. 
Basic information is needed, and this study 
seeks answers to two rather broad, funda- 
mental questions: (a) what tests are of value 
in assessing the adjustment of the blind; and 
(b) what characteristics do these tests show
to be typical of the blind? Answers to these 
queries, based upon explicit rationale and 
careful methodology, can offer some foundation
which may act as a basis for a psychology
of blindness.


Problem 


Much material has been written about the 
blind, but descriptive phrases have only re- 
cently given way to testing and group in- 
vestigations. And, in general, the literature 
reveals that studies have been done too poorly 
or are too conflicting to provide reliable in- 
formation concerning adjustment to blind- 
ness. Barker et al. (1) indicated the concern 
of many workers with the blind when they 
wrote that most research to date consists 
merely of random exploratory collections of 
data. Worchel (10) discussed “shotgun” com- 
parisons, the limited number of subjects in 
many studies, and a number of other methodological
errors. As recently as 1954 Bauman
and co-workers “. . . claim that this is
the first scientific writing on adjustment to
blindness” (3, p. 1). However, their study
was avowedly “. . . to inquire into all the
factors which seem likely to have some relationship
to adjustment to blindness . . .” (3,
p. 3). One of the best publications, a collec- 
tion of papers by Donahue and Dabelstein 
(6), is generally more suggestive than defini- 
tive, and opinions primarily support the pres- 
entations because of the paucity of facts. 

The present study is concerned with the 
problems of just what tests to use in evaluat- 
ing adjustment to blindness; the modifications 
which may be required in the interpretation of 
“sighted” test results with the blind; and any 
unique personality patterns related to blind- 
ness as such. The research was conceived as 
one for practical purposes involving vocational
rehabilitation clients. The 54 blind subjects
of the investigation consisted of 34 males
and 20 females, at present living in the State 
of Oregon, and probably no different in any 
major respect from the blind elsewhere. Many 
subjects came to Oregon from other states 
both before and after the acquisition of a 
visual impairment. Insofar as the national 
figures are known, the blind population sta- 
tistics for Oregon are proportionately com- 
parable. 


The Experimental Design 


The research is based on a 3 × 3 × 2 factorial
design with three replications. Each
subject was classified in terms of: (a) adjustment,
good, fair, or poor; (b) duration of
handicap, born blind, long-term with visual 
experience, and recent blind; and (c) re- 
maining acuity, little or no vision, and rela- 
tively “good” vision. Three subjects appear 
in each of the possible combinations of the 
above groupings in order to obtain a meas- 
ure of variability within a given subcategory. 
Table 1 shows the factorial design in tabular
form, and in the following sections each classification
of the subjects will be discussed in
detail.


Table 1

The Factorial Design in Tabular Form
(Indicating the number of subjects in each cell)

                       Visual categories
Adjustment    Group I      Group II     Group III
rating        Tot   Tvl    Tot   Tvl    Tot   Tvl    Total

Good           3     3      3     3      3     3      18
Fair           3     3      3     3      3     3      18
Poor           3     3      3     3      3     3      18
Total          9     9      9     9      9     9      54
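
The 3 × 3 × 2 layout with three subjects per cell can be enumerated in a few lines of Python as a check on the cell and marginal counts. This is only an illustrative sketch; the labels follow the text, while the variable and function names are assumptions.

```python
from itertools import product

ADJUSTMENT = ("Good", "Fair", "Poor")
DURATION = ("Group I", "Group II", "Group III")
ACUITY = ("Tot", "Tvl")
REPLICATIONS = 3  # subjects per cell, as stated in the text


def design_cells():
    """Map every (adjustment, duration, acuity) combination to its cell size."""
    return {cell: REPLICATIONS for cell in product(ADJUSTMENT, DURATION, ACUITY)}


def marginal(cells, index, level):
    """Number of subjects at one level of the factor at position `index`."""
    return sum(n for cell, n in cells.items() if cell[index] == level)


cells = design_cells()
total = sum(cells.values())  # 18 cells x 3 subjects
```

Each adjustment rating and each duration group contains 18 subjects, and each duration-by-acuity column contains 9, which agrees with the marginal totals of 18, 9, and 54 given for the design.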


Adjustment. It has become almost traditional to 
speak of blind adjustment as involving the areas of: 
(a) freedom and ability to travel independently; 
(b) interpersonal relations and acceptance by so- 
ciety; (c) outlook on life; (d) self concept and self- 
acceptance; (e) work experience and attitudes; (f) 
social participation; (g) acceptance of limitations 
and use of assets; (h) satisfactory grooming and
hygiene; (i) education and its application. These
nine areas cover a number of kinds of behaviors and
types of adjustments so that no measure which 
would attempt to account for all of them would be 
unitary. Taken together they offer an approach to 
evaluating blind adjustment. 

It was rather arbitrarily decided that two state- 
ments exemplifying good adjustment in each of the 
nine areas would be used as the basis for evaluating 
the subjects. Ratings were made on a five-point 
scale; the combined arithmetic means of the raters 
being accepted as the best over-all estimate of a sub- 
ject’s observable adjustment. Without a more defi- 
nite criterion available it is necessary to call upon 
experienced workers in the field to evaluate observ- 
able behaviors, and to use their consensus judgment 
as a measure of functioning adjustment. The eighteen 
statements were reworked until unanimity of con- 
cept was achieved by the judges. The judges were 
persons with years of experience in work with the 
blind and were themselves personally acquainted 
with visual handicap.1 All are employed at the Ore- 
gon State Commission for the Blind. 

As the judges’ ratings are basic to this study a
measure of retest reliability was executed. Practical
considerations precluded complete rerating, so the
judges were asked three months later to rerate each
subject on a five-point scale of “global” adjustment.
Again the mean rating was taken to be the best
estimate of the subjects’ adjustments. In the first
ratings the subjects were listed in a random order,
while in the second they were listed alphabetically
by name as an aid in controlling for a “position” set.

1 The investigator is grateful for the assistance of:
Mr. Clifford Stocker, Administrator; Mr. Charles
Brown, Director of Vocational Rehabilitation for the
Blind; and Mr. George Howeiler, Supervisor of Social
and Educational Services.

Despite “global” rerating, the combined judges’ re- 
ratings yielded a coefficient of reliability of .91 indi- 
cating high reliability and consistent judgments. Rat- 
ings were not made unless the judge felt he had 
sufficient data about the subject. In this way guess- 
ing was reduced and a “halo effect” minimized. All 
subjects were not known equally well by all raters; 
eight evaluations had to be based on the combined 
ratings of only two judges, but these were in rea- 
sonably close agreement. 
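
The rating procedure can be sketched as follows, assuming that the coefficient of reliability is a product-moment correlation between the mean first ratings and the mean reratings (the text does not name the coefficient). The ratings shown are hypothetical, and the rule that at least two judges must have rated a subject is an assumption suggested by, but not stated in, the text.

```python
from math import sqrt


def mean_rating(ratings):
    """Mean of the ratings actually given; None marks a judge who declined
    to rate for lack of data. Requires at least two ratings (assumption)."""
    given = [r for r in ratings if r is not None]
    if len(given) < 2:
        raise ValueError("need ratings from at least two judges")
    return sum(given) / len(given)


def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical five-point ratings by three judges for four subjects.
first = [mean_rating(r) for r in [(4, 5, 4), (2, None, 3), (3, 3, 4), (1, 2, 2)]]
second = [mean_rating(r) for r in [(5, 5, 4), (2, 3, None), (3, 3, 3), (2, 1, 2)]]
reliability = pearson_r(first, second)
```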

Duration of handicap. Three groups were distin- 
guished according to the amount of visual experience 
prior to blindness, and the duration of the visual 
condition. Group I: those who have been blind for 
over ten years and who became blind before their 
fifth year of life, and therefore are presumed to have 
minimal retention of visual imagery. Group II: those 
who have been blind for ten years or more and who 
became blind after their fifth year of life, and who 
are presumed to have residual visual imagery. Group 
III: those who have been blind for less than five 
years and who have had many years of visual ex- 
perience, and who may be expected to show some 
traces of the adjustment needs engendered by the 
loss of vision. 

Remaining acuity. The two subgroups based upon 
residual vision are the last to be considered in con- 
nection with the factorial design. (Tot): those who 
have a total loss and those with only light percep- 
tion, on the assumption that these conditions pro- 
vide a somewhat homogeneous visual group based 
on minimal useful sight. (Tvl): those who have 
travel vision and those with object perception, on 
the assumption that these conditions are similar visually
and are qualitatively different from the first subgroup.


Measures of Adjustment 


Subjects’ ratings. In order to measure the 
client’s evaluation of himself a list of 100 
statements was compiled from various sources. 
These items had been used in previous stud- 
ies to measure some phase of adjustment. 
Three major areas were utilized: (a) general
adjustment; (b) adjustment to blindness;
and (c) body image. These items were submitted
to judges with the instruction to choose
those statements they felt would best discrimi- 
nate degrees of adjustment of blind persons. 
The judges for this list were three experienced
workers with the blind, mentioned
previously, the author, and three psychologists
at the University of Portland.2


The items selected for this study were those agreed 
upon by at least four of the judges. Whenever pos- 
sible only those items were used which showed 
agreement between the psychologists and the work- 
ers. After pretest trials ambiguous wording was re- 
duced without destroying the intent or meaning of 
the item. In this manner a 34-item self-rating scale 
was devised. 

The subject was asked to indicate, on the basis of 
a seven-point scale, how much each item applied to 
himself. In order to make a response on a seven- 
point scale possible for the blind subjects, a card 
with the seven responses in Grade II Braille down 
the left-hand side, and with large, black print down 
the right-hand side, was given to each subject as a 
memory aid while formulating his answers. Only a 
few subjects could use neither Braille nor print; for 
these the responses were read slowly through after 
each statement. 

The subjects were also asked to evaluate them- 
selves on an over-all or “global” basis by selecting 
one of five categories, from poor to very good, which 
they felt best described their adjustment. 


Psychological tests. A selection of tests was 
made so as to include those which: are pres- 
ently used with the blind; sample different 
aspects or levels of adjustment; have sighted 
norms; and are of various construction types 
or theoretical bases. 


Bauman’s (2) Emotional Factors Inventory (EFI) 
was selected as exemplifying the attempt to measure 
adjustment to blindness. It is an inventory type test 
which is new but apparently well received. This test 
purports to measure the areas of: Sensitivity, So- 
matic Symptoms, Social Competency, Paranoid 
Tendency, Feelings of Inadequacy, Depression, Atti- 
tudes re Blindness, and a measure of Validity. 

The Minnesota Multiphasic Personality Inventory 
(MMPI) is a commonly used test which was de- 
veloped on clinical populations. It has extensive 
norms and has been applied to the blind. Although 
a method of arriving at a single adjustment score 
may exist, the investigator could not find mention 
of one in a search through possible sources. For that 
reason the scoring method used in this study will be 
clarified. It was decided that maladjustment was indicated
by two major dimensions: amount of deviation
from the midline on the clinical scales, and the
number of such deviations. By assigning the number
one to scores falling within the range 41-59 a subject
scoring within this area on all ten factors would
receive a minimum score of 10. In a like manner,
factors scored in the ranges of 60-69 and 31-40
would carry a weight of two; scores below 30 or in
the range 70-79 would be weighted three; and so on
with scores above 80. Using this scoring procedure
it was possible to get a single figure representative
of both number of deviations and their extent.

2 The investigator expresses his indebtedness to:
Dr. William Botzum, Chairman, Department of Psychology;
Dr. Gordon Higginson, Director, Psychological
Services; and Dr. Frank Strange, Staff Psychologist,
Psychological Services.
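
The composite MMPI score described above can be sketched in Python. The weights for the stated ranges follow the text; three points are assumptions made only for illustration: scores are treated as whole numbers, a score of exactly 30 is given the weight of three, and "and so on" is read as one further unit of weight for each additional ten points from 80 upward. The function names are hypothetical.

```python
def scale_weight(t_score: int) -> int:
    """Weight for one clinical scale according to its distance from the midline."""
    if 41 <= t_score <= 59:
        return 1
    if 31 <= t_score <= 40 or 60 <= t_score <= 69:
        return 2
    if t_score <= 30 or 70 <= t_score <= 79:
        return 3  # assumption: a score of exactly 30 is weighted three
    # Assumption: "and so on" adds one unit per further ten points (80-89 -> 4).
    return 4 + (t_score - 80) // 10


def composite_score(t_scores) -> int:
    """Sum of the weights over the ten scales; the minimum possible value is 10."""
    if len(t_scores) != 10:
        raise ValueError("expected scores on ten scales")
    return sum(scale_weight(t) for t in t_scores)
```

For example, a subject with eight scales in the 41-59 range, one at 65, and one at 72 would receive 8 + 2 + 3 = 13, so that a single figure reflects both the number of deviations and their extent.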

The Rotter Incomplete Sentences Blank (ISB) is 
a semiprojective technique consisting of 40 items.
The author (7) presents norms for normal and mal- 
adjusted populations. Scoring is based on rating the 
completion in terms of expressed conflict, with lower 
over-all scores indicating better adjustment. 

The Sargent Insight Test (Insight) is a new pro- 
jective technique which is applicable to the blind. 
Brief situations, termed armatures, are presented to 
the subject and scoring is based upon the answers to 
the two questions: “What did the person do, and 
why?” and “How did the person feel?” Fifteen situations
constitute the test, with alternative forms for
male and female subjects not used in this study.
Quantification of responses is possible in the areas
of Affect (A), Defense (D), and the ratio between
them (A/D). Twelve “feeling categories” are also
used to classify further all responses scored Affect.
The author (8) presents data from various groups 
of subjects so that comparisons are possible. The 
test not only aims to reach deeper aspects of the 
personality but it offers a measure of defensiveness 
at the same time. 

The Wechsler-Bellevue Intelligence Scale, Form I, 
Verbal Scale, was used to measure the subjects’ in- 
tellectual abilities. Although intelligence was not con- 
trolled by sampling in the study, its importance 
could not be ignored. By statistical analysis the in- 
fluence of intelligence upon the subjects’ scores could 
be controlled if necessary. The only change needed 
for blind subjects is the substitution of the alternate 
questions in the Comprehension subtest as suggested 
by Bauman and Hayes (4). 


Procedure 


The order of test presentation was ran- 
domly determined so that no position bias 
could occur, and the assignment of the sub- 
jects to the cells of the experimental design 
was as random as the design permits. Co- 
operation of the subjects was solicited through 
an appeal to the need for information about 
the blind, and with the understanding that 
the data would only be used in group terms. 
The method of presentation in all testing was 
oral, with the responses scored by the ex- 
aminer in every case. 

In the experimental design sex was not
controlled as a factor. In order to determine
whether sex differences existed in age, education,
or judges’ ratings the data were submitted
to t test. In no case could the null hypothesis
be rejected, and it was concluded
that only chance differences existed. Chi
square, two-tailed and corrected for continuity,
was employed to test other factors which
might be affected by sex differences. The factors
tested were marital status, employment
status, and years of work experience. Differences
in marital status and in the employment
status of the sexes were not large enough
to reject the null hypothesis. However, differences
in years of work experience yielded a
chi square large enough to be significant beyond
the 1% level. A tendency is revealed
for the males to work less after suffering a
visual loss, and for the females to work more
at that time. This phenomenon might be
worthy of more critical analysis than by the
chi square computed in this study. It is reasonable
to assume, then, that conclusions can
be drawn equally about the sexes in further
discussions of the subjects since both sexes
are similar in the major attributes tested.


Table 2

Means and Standard Deviations of the
Measures of Adjustment

Test             Mean       SD

Insight          58.07     14.56
ISB             130.30     22.78
MMPI             13.44      2.54
EFI               5.65       .62
Wechs.          113.56     13.02
Self-eval.        5.78*      .74
Adj. Rating       3.13†      .94

* Seven-point scale.
† Five-point scale.




Results 


The means and standard deviations of the 
various measures, which will be discussed 
later in relation to test scores, are shown in 
Table 2. Intercorrelations between the vari- 
ous measures are presented in Table 3. Four 
of the intercorrelations are significant at the 
1% level, with the EFI and the self-evalua- 
tions showing greatest agreement. These re- 
lationships suggest that the EFI and the 
Wechsler, particularly in combination, might 
prove to be good predictors of the adjustment 
ratings. To test this suggestion these two 
measures, and the MMPI and self-evaluations 
which were significantly related to the ad- 
justment rating at the 5% level, were em- 
ployed in a multiple triserial (point biserial) 
correlation. The computations, explained in 
Wert et al. (9), resulted in an R of .465
which yields an R of .51 when corrected for
coarse grouping. This multiple correlation was
tested by the F-ratio formula (9); with 4
and 49 degrees of freedom the F of 3.38 is
significant beyond the 5% level. However,
this R is not so large as to account for more
than about 26% of the variance to be found
in the adjustment categories. This amount is
so small as to make the computation of a discriminant
equation a labor of minor productiveness
for practical predictions.
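
The arithmetic behind these figures can be checked with a short sketch. It assumes the usual F test for a multiple correlation, F = (R²/k) / ((1 − R²)/(N − k − 1)), with N = 54 subjects and k = 4 predictors; whether this is exactly the formula given in (9) is an assumption, although it reproduces the reported degrees of freedom and, using the uncorrected value of .465, the reported F. The function name is illustrative.

```python
def f_for_multiple_r(r: float, n_subjects: int, n_predictors: int):
    """F ratio and degrees of freedom for testing a multiple correlation."""
    r_squared = r * r
    df1 = n_predictors
    df2 = n_subjects - n_predictors - 1
    f_ratio = (r_squared / df1) / ((1.0 - r_squared) / df2)
    return f_ratio, df1, df2


# Uncorrected multiple correlation reported in the text.
f_ratio, df1, df2 = f_for_multiple_r(0.465, n_subjects=54, n_predictors=4)

# Proportion of variance accounted for by the corrected coefficient.
variance_explained = 0.51 ** 2
```

The squared corrected coefficient, .51² = .26, is the source of the "about 26% of the variance" figure.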





Analysis of Measures of Adjustment 


Since this study rests upon the division
of the subjects into adjustment categories
through the use of mean ratings by professional
workers these ratings will be evaluated
first. It is expected that adjustment categories
based upon the ratings will be well
differentiated, and interest is centered upon
differences in rated adjustment among the
other two classifications. An analysis of
variance for the judges’ adjustment ratings
yielded an F well beyond the 1% level as
expected; however, the time of occurrence
also yielded an F significant at the 5%
level. The judges, apparently, tended to favor
Group II subjects. Since intelligence was
shown to be significantly related to the adjustment
rating, an analysis of covariance
was employed to determine whether ratings
were influenced by subjective evaluations of
intelligence. The covariance between intelligence
and judges’ adjustment ratings yielded
an F value beyond the 1% level of confidence
with occurrence and acuity nonsignificant.
Adjustment accounted for almost all of the
variation, and intelligence appears to be the
determinant which made Group II scores
somewhat higher.


Table 3

Intercorrelations Between Measures of Adjustment

Test          ISB    MMPI    EFI      Wechs.   Self-eval.   Adj. Rating

Insight       [?]    [?]     .13      .29*     .05          .05
ISB                  [?]    -.36**   -.15     -.30*        -.16
MMPI                        -.25*    -.12     -.28*        -.24*
EFI                                   .28*     .62**        [?]
Wechs.                                         .00          .39**
Self-eval.                                                  .26*

* Significant at the 5% level of confidence.
** Significant at the 1% level of confidence.
[?] Entry not legible in the source.

The subjects’ self-evaluation scores were 
next submitted to an analysis of variance. 
The residual error term accounted for enough 
variability to preclude computation of the F 
ratios. It was concluded, with no significant 
Fs, that the subjects’ self-evaluations were 
not patterned in terms of the adjustment 
categories nor in terms of the visual group- 
ings. 

The EFI scores earned by the subjects were 
placed in the factorial design and an analysis 
of variance computed; none of the F ratios 
was significant. Since a significant relation- 
ship exists between the EFI and the Wechsler, 
an analysis of covariance was made to deter- 
mine whether control of intelligence would 
permit better discrimination. The adjusted 
means were so low as to disallow the com- 
putation of the F ratios. It was concluded 
that the EFI does not discriminate adjust- 
ment of the subjects as defined by the nine 
areas and evaluated by the judges. It also 
appears that with intelligence controlled the 
variability is reduced so that rather than 
masking the discrimination powers of the 
EFI, intelligence seems to contribute some- 
thing to the ability of the test to measure 
degrees of adjustment. 

The MMPI was the next measure considered.
By placing the previously mentioned
composite scores into the experimental design
an analysis of variance could be computed.
None of the major categories gave a
large enough mean square to compute the F 
ratio. With the triple interaction, however, 
there is a significant rejection of the null hy- 
pothesis at the 5% level. This finding is not 
so valuable for practical use as it is for pos- 
sible follow-up. An analysis of covariance was 
not deemed necessary, and it was concluded 
that the MMPI does not adequately differ- 
entiate the subjects in terms of the criterion 
adjustment. 

An analysis of variance was next done with 
the Rotter (ISB) scores. The results indi- 
cated that the total variability was quite 
small and that no category accounted for 
enough of the variance to necessitate compu- 
tation of the F ratios. It was concluded that 
the objective scoring of the ISB does not dis- 
criminate in terms of the adjustment criterion 
used in this study; more value may reside in 
the qualitative data available from the test. 

An analysis of variance was executed on 
the A/D ratio scores earned by the subjects 
on the Insight test. Only one interaction was 
larger than the residual despite the large 
variability, but the resulting F ratio was not 
large enough to reach significance. As the In- 
sight is correlated with intelligence at the 5% 
level, the data were subjected to an analysis 
of covariance. With intelligence controlled no 
variability was larger than the error term. 
The test does not discriminate adjustment on 
the basis of the A/D ratio. 

Intelligence so far has been considered only 
for its possible influence upon the other tests 
with which it is correlated. It was next con- 
sidered in its own right as a possible indi- 
cator of adjustment. The results, however, 
showed no significant F ratios. It was con- 
cluded that intelligence does not vary signifi- 
cantly with the adjustment criterion. It would 
not be expected that visual grouping would 
vary systematically unless a bias had been 
introduced into the study. This affords some 
indication that the design was adequate for 
evaluating the tests. 


Characteristics of the Blind 


The author of the EFI gives various norms, 
the latest of which were based upon an N of
200 and presented in 1954. The subjects of
the present study appear to be above the av- 
erage for these norms, scoring even higher 
than the nonhandicapped persons tested by 
the EFI author. It seems reasonable to con- 
clude that the items used in the various areas 
of the test were assumed a priori to discrimi- 
nate adjustment. However, they show too 
much variability and need more refinement 
for value in individual prediction. 

The MMPI psychographs (short form plus 
full K and Si scales) were plotted for the 
mean scores made by the male and female 
subjects. For both sexes the profiles were 
within normal limits on all areas. For both 
sexes, also, there were three “peaks” which 
occur for the K factor, the Mf score, and the 
Ma score. If this pattern is in any way typi- 
cal of blind subjects, it is in areas which have 
not been emphasized or explored. The only 
divergence in sex patterning was in the area 
of social interests, with women scoring above 
the midline and men below. 

Rotter has suggested that a cutting score 
of 135 on the ISB will correctly identify 75- 
80% of the maladjusted cases. The subjects 
of this study scored higher than the adjusted 
population or the college freshmen norms pre- 
sented by the author, but they scored lower 
than the maladjusted groups. If the reasons 
were known why the college group scored 
higher than the adjusted group it would aid 
in hypothesizing the meaning of the blind 
subjects’ mean. 

The Insight test manual presents only two 
protocols from blind subjects so that only 
casual comparison can be made with the sub- 
jects of this study. However, these Insight 
cases and the subjects of this study resemble 
each other more than they resemble the non- 
blind samples furnished by the test author. 
In both cases the Affect-Defense ratio (A/D) 
is much lower for the blind; the mean for 
feeling or action (A) is relatively low, the de- 
fensiveness (D) approaches the norm group, 
malignancy scores (M) are higher, and ag- 
gressive-passive feeling dominates. In general, 
the subjects of this study are more like the 
clinical groupings than like the control group 
of the norms. It is concluded that the Sargent 
norms should be applied with caution to blind 
subjects. 


The mean (verbal) intelligence quotient 
was in the bright-normal range, with about 
19% of the subjects earning scores below av- 
erage. This is the usual finding with blind 
subjects on the Wechsler Form I. As a group 
the subjects were lowest on immediate mem- 
ory (likely to be affected by tension), and 
were highest in differentiating essentials from 
nonessentials. The intellectual processes of 
the blind appear to be no different from those 
of the sighted, although tension may be some- 
what more typical of the blind than of the 
general population. 

Discussion 

Although it was anticipated that discrimi- 
nation among the adjustment categories would 
improve as the tests became more “projec- 
tive,” such was not the case. None of the 
tests was able to differentiate good or poor 
adjustment by a direct comparison of single, 
representative scores. It is obvious that the 
less structured tests offer much qualitative 
data which is lost in the use of single scores. 

The present study indicates that the MMPI 
is applicable to the blind without modifica- 
tion. This finding is in agreement with Cross 
(5), and the need for separate blind norm 
tables is not indicated. With the Insight test, 
however, Sargent’s norms (8) differ enough 
to suggest that such norms should be cau- 
tiously applied to the blind. Further investi- 
gation may provide blind norms. The EFI 
and MMPI scores suggest that the blind are 
not paranoid or depressed as a group; a find- 
ing at variance with previous assumptions. 
The MMPI further suggests three scores 
which might distinguish the blind. And the 
significant triple interaction in the MMPI 
analysis of variance indicates that adjust- 
ments may be differentiated if the variables 
of duration and acuity are controlled. 

The Rotter ISB method of scoring may 
tend to obscure levels of adjustment through 
a process of cancellation. Its value with the 
blind probably resides in qualitative rather 
than quantitative evaluations. The Insight 
will probably lose its major value if only a 
single score such as the A/D ratio were to be 
used. When subscores are utilized, for in- 
stance, the patterning is similar to that for 
hysterics presented by Sargent (8). 










The present study was rather iconoclastic, 
for the findings indicate that some previous 
studies may not have been as indicative of 
“adjustment to blindness” as the authors may 
have hoped. Since somewhat unusual cate- 
gories of vision were used in this study it 
would seem worth while to investigate fur- 
ther these and other groupings which will 
avoid the arbitrary number values now em- 
ployed. Some leads furnished by this study 
may yield more value than those attempts to 
show that the blind are “really” more mal- 
adjusted than the sighted. 


Summary 


The study was designed to evaluate various 
“representative” tests which are or could be 
used with the blind, and to discover factors 
which might be attributed to blindness. The 
subjects, 34 male and 20 female, were rated 
for adjustment, divided into groups on the 
basis of duration of handicap, and subcategorized
on the basis of visual acuity. The experimental
design, then, was a 3 × 3 × 2
factorial with three replications.

The tests selected were the: Subjects’ self- 
evaluations, Emotional Factors Inventory, 
Minnesota Multiphasic Personality Inven- 
tory, Rotter Incomplete Sentences Blank, 
Sargent Insight Test, and the Wechsler-Belle- 
vue Intelligence Scale. The judges’ adjust- 
ment ratings were found to be highly reliable 
by test-retest, and appeared adequate as the 
adjustment criterion. The major results may 
be summarized as follows: 

1. In general, the various analyses of vari- 
ance and covariance computed from the single 
test scores did not reveal significant differ- 
ences between the subgroups of blind subjects 
on the experimental variables. 

2. A multiple triserial correlation proved
to be significant but of limited value in predicting
behavioral adjustment from the single
test scores.

3. Norm comparisons showed that the 
MMPI can be used without modification 
with the blind, and that the Insight pattern- 
ing of answers suggests a cautious use of the 
author’s norms with the blind. 


Received July 6, 1956. 


References 


1. Barker, R. G., Wright, Beatrice A., Meyerson,
L., & Gonick, Mollie. Adjustment to physical
handicap and illness: a survey of the social
psychology of physique and disability. New
York: Social Science Research Council, 1953,
Bulletin 55.

2. Bauman, Mary K. A comparative study of personality
factors in blind, other handicapped,
and non-handicapped individuals. Washington:
U. S. Office of Vocational Rehabilitation,
1950, No. 134.

3. Bauman, Mary K. (Ed.) Adjustment to blindness.
Commonwealth of Pennsylvania: Department
of Welfare, 1954.

4. Bauman, Mary K., & Hayes, S. P. A manual for
the psychological examination of the adult
blind. New York: Psychological Corp., 1951.

5. Cross, O. H. Braille edition of the MMPI for
use with the blind. J. appl. Psychol., 1947, 31,
189-198.

6. Donahue, Wilma, & Dabelstein, D. (Eds.) Psychological
diagnosis and counseling of the
adult blind. New York: American Foundation
for the Blind, 1950.

7. Rotter, J. B., & Rafferty, Janet E. Manual: the
Rotter incomplete sentences blank. New York:
Psychological Corp., 1950.

8. Sargent, Helen D. The insight test. New York:
Grune & Stratton, 1953.

9. Wert, J. B., Neidt, C. O., & Ahmann, J. S. Statistical
methods in educational and psychological
research. New York: Appleton-Century-Crofts,
1954.

10. Worchel, P. Psychological implications of blindness.
Seer, 1954, 24, 26-33.










Journal of Consulting Psychology 
Vol. 21, No. 2, 1957 


Childrearing Attitudes of Emotionally 
Disturbed Adolescents¹


George Spivack 


Devereux Schools 


Various studies over the past decade sug- 
gest a marked consistency in childrearing atti- 
tudes of mothers of sick children. The pur- 
pose of the present study was (a) to examine 
the childrearing attitudes held by emotion-
ally disturbed adolescents, and (b) to see if
their attitudes approximate those of mothers 
of sick children, which would suggest atti- 
tude perpetuation. 

A 64-item attitude survey was administered 
to 34 male and 32 female adolescents being 
treated in a residential setting. A normal 
residential school group of 34 males and 45 
females was the control. Each item was previ- 
ously classified by a psychiatrist, two psy- 
chologists, and a social worker as reflecting 
(a) restrictive control, (b) ineffectual control,
(c) excessive devotion, or (d) cool detach- 
ment in parental attitude toward a child. 
Each S was required to strongly agree, mildly 
agree, strongly disagree, or mildly disagree 
with each item. Each item response was 
weighted in such a manner that the experi- 
mental and control groups could be compared 
on each of the four subscales through the t
test, and could also be compared on each item
individually through chi square. 
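
The subscale comparison described above can be illustrated with a minimal sketch. The response weights below are hypothetical, since the report does not state the values actually used, and the function names are chosen only for this illustration. The sketch sums weighted responses into a subscale score and computes a pooled-variance t statistic for two independent groups.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical weights: the report does not give the actual values.
WEIGHTS = {
    "strongly agree": 4,
    "mildly agree": 3,
    "mildly disagree": 2,
    "strongly disagree": 1,
}


def subscale_score(responses):
    """Sum the weighted responses of one subject on one subscale."""
    return sum(WEIGHTS[r] for r in responses)


def pooled_t(a, b):
    """Student's t for two independent groups, using the pooled variance."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(sp2 * (1 / na + 1 / nb))
```

Each subject's four subscale scores would be passed, group by group, to `pooled_t`; the item-level chi square comparisons are a separate analysis not shown here.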

Comparison on the four subscales revealed 
that the emotionally disturbed adolescents of 
both sexes expressed a significantly more re- 
strictive controlling attitude than the con- 
trols (.05 level). No significant differences 
were found on the other three subscales. Sig-
nificant chi squares (.05 level or better) were
obtained on 14 items for the males, and 5
more items approached significance (.10
level). There were 14 significant items for
the females, 1 approaching significance.

1 An extended report of this study may be ob-
tained without charge from George Spivack, Dev-
ereux Schools, Devon, Pennsylvania, or for a fee
from the American Documentation Institute. Order
Document No. 5103, remitting $1.25 for microfilm
or $1.25 for photocopies.

The results lend support to the hypothesis 
that childrearing attitudes are perpetuated in 
the case of overcontrolling and restricting atti- 
tudes, but not in the case of attitudes reflect- 
ing ineffectual control, excessive devotion, or 
cool detachment. The absence of positive re- 
sults in the latter cases suggests that such 
attitudes are not perpetuated in any direct 
sense, or that if perpetuated, their expression 
is too subtle for such a questionnaire to pick 
up. The positive results suggest the impor- 
tance of further exploration into attitude per- 
petuation, particularly the means whereby 
attitudes are handed down to or defined for 
the younger generation. Approaching the re- 
sults in terms of what light they shed on ado- 
lescent adjustment generally, there is indica- 
tion that emotionally disturbed adolescents, 
much more than normal adolescents, feel a 
strong need for parental or parental-surrogate 
imposition of external controls; they seem to 
feel a stronger need to conform to what they 
see as parental values and standards of right 
and wrong. These results suggest that the sup- 
posed adolescent rebellion is less character- 
istic of emotionally disturbed adolescents than 
of normal adolescents, and that overt behav- 
ior labeled as rebellion in these children is not 
a positive drive for independence and a search 
for new values, but rather a confused search 
for self-definition and standards of conduct 
to follow. 


Brief Report. 
Received December 21, 1956. 





















Age, Vocabulary, Anxiety, and Brain Damage 
as Factors in Verbal Learning¹


J. P. S. Robertson 
Netherne Hospital, Coulsdon, England


There are still many difficulties in deciding 
whether inefficiencies shown by neuropsychi- 
atric patients in tests of memory point to 
brain damage or result from factors such as 
advanced age, poor verbal ability and test 
anxiety (1). This investigation was conducted 
to throw light on the relative importance of 
the latter factors in the verbal learning of pa- 
tients without brain damage and to see, when 
they were taken into account, if patients with 
brain damage gave a less efficient perform- 
ance than others. 


Patients Tested 


The undamaged patients were drawn from 
relieved psychotics awaiting discharge and 
chronic psychotics on parole who were in 
stable hospital employment. A classification 
of suitable patients was made according to 
sex, age, and vocabulary level. The age 
classification was into young (20-39), middle- 
aged (40-59), and old (60-79). The vocabu- 
lary classification was based on the Wechsler- 
Bellevue Vocabulary score as low (under 23) 
and high (23 or above). Ten patients of each 
sex were examined for all combinations of age 
and vocabulary level, 120 undamaged patients 
in all. They were drawn in alphabetical order 
from appropriate wards until the requisite 
numbers were made up. The mean ages were: 
young 29.8, SD 6.7; middle-aged 50.7, SD 
5.6; old 67.3, SD 5.0. The mean vocabulary
scores were: low group 16.9, SD 3.6, and
high group 30.3, SD 4.4.

1 The author is grateful to Dr. R. K. Freudenberg,
physician superintendent, Netherne Hospital, for fa-
cilities to conduct this investigation, and to Dr. A.
Walk, physician superintendent, Cane Hill Hospital,
Coulsdon, for providing additional brain-damaged
patients.

The brain-damaged patients were all such 
(excluding arteriosclerotics) who were pres- 
ent in two neuropsychiatric hospitals in July 
1955, and were able or willing to cooperate. 
They were classified according to age and vo- 
cabulary in the same way as the undamaged 
group. They were also classified in accord- 
ance with their neuropsychiatrist’s opinion as 
showing mild or severe dementia. According to 
age they were: young 4, middle-aged 29, and 
old 26. According to vocabulary they were 
high 18 and low 41. According to dementia 
they were: mild 38 and severe 21. The diag-
noses of the mildly demented were: head in-
jury 1, cerebral tumor 1, cerebral atrophy 7,
paresis 11, vascular rupture 1, carbon mon-
oxide poisoning 1, alcoholic dementia 3,
Huntington’s chorea 2, organic senile de-
mentia 11. The diagnoses of the severely
demented were: head injury 1, cerebral
atrophy 3, paresis 3, vascular rupture 1,
alcoholic dementia 2, Huntington’s chorea 5,
organic senile dementia 6.
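
As a quick arithmetic check on the sample description, the short sketch below tallies the reported counts. The dictionary keys are shorthand labels chosen for this illustration; the numbers are those given in the text. The tallies agree with the stated totals of 38 mildly demented, 21 severely demented, 59 brain-damaged, and 120 undamaged patients.

```python
# Counts transcribed from the text; keys are shorthand labels for this sketch.
mild = {
    "head injury": 1, "cerebral tumor": 1, "cerebral atrophy": 7,
    "paresis": 11, "vascular rupture": 1, "carbon monoxide poisoning": 1,
    "alcoholic dementia": 3, "Huntington's chorea": 2,
    "organic senile dementia": 11,
}
severe = {
    "head injury": 1, "cerebral atrophy": 3, "paresis": 3,
    "vascular rupture": 1, "alcoholic dementia": 2,
    "Huntington's chorea": 5, "organic senile dementia": 6,
}
by_age = {"young": 4, "middle-aged": 29, "old": 26}
by_vocabulary = {"high": 18, "low": 41}

mild_total = sum(mild.values())
severe_total = sum(severe.values())
brain_damaged_total = mild_total + severe_total

# Undamaged group: 2 sexes x 3 age bands x 2 vocabulary levels x 10 patients.
undamaged_total = 2 * 3 * 2 * 10
```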

In regard to test anxiety all patients were 
classified according to whether or not they 
displayed observed and stated anxiety. They 
were said to show observed anxiety if the 
tester assessed them as anxious on the evi- 
dence of their test behavior. They were said 
to show stated anxiety if they answered posi- 
tively the question: “Did you feel nervous 
while doing these tests or that you must get 
them right?” The tester’s assessment was re-
corded prior to asking this question. The pa-
tients were also asked if they had had any
reason to complain of memory difficulties in 
the recent past. 












Tests Employed 


The kinds of verbal learning investigated 
were paired-associate learning and cumulative 
rote learning. Since a common complaint of 
memory difficulties concerns trouble with per- 
sonal names the paired associates were based 
on these: Grocer-Brown, Butcher-Thomson, 
Baker-Williams, Fishmonger-Todd, Chemist- 
Jones, Draper-Cook, Plumber-Smith, News- 
agent-Hunt, Greengrocer-Robinson, Ironmon- 
ger-Lewis. The instructions were: “This is a 
test of your memory for names. There were 
ten shops in a village street. I’m going to tell 
you the names of the shopkeepers and what 
the business of each was. Then I want you to 
tell me the name of each shopkeeper when I 
say his business.” The associates were pre- 
sented auditorily in six cycles, the order of 
each cycle being derived first from randomiza- 
tion and then rearrangement so that in the 
total no two names were in immediate suc- 
cession more than once. All associates were 
said before each cycle in the order of that 
cycle. The administration was in the form: 
“The grocer’s name was .. .” The patient 
was allowed up to 15 seconds to answer, then 
prompted. One mark was given for each cor- 
rect answer so that the maximum was 60; 
the obtained range was 0-57. A score was 
also kept of all names offered which were en- 
tirely outside the list (e.g., Murphy); these 
were termed paramnesias. A parallel test 
based on women’s names was administered to 
81 patients in the undamaged group at six 
weeks after the first testing. 

The cumulative rote-memory test was ad- 
ministered in the style of “This was the House 
that Jack Built.” Five lists were used, each 
comprising 15 names of the same class of ob-
ject: (a) animals, (b) body parts, (c) ob-
jects connected with eating, (d) fruits, (e)
tools. The patient was presented auditorily 
with one item, two items, three items, etc., 
without variation of order until he made an 
error, when the next list was begun. The 
score was the grand total attained without 
error, the maximum being 75 and the ob- 
tained range 15-42. A parallel set of lists 
was administered to 81 patients six weeks 
after the first testing. 
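
The scoring of the two tests can be sketched as follows. This is one reading of the description, not the author's stated procedure: it assumes that each rote list contributes the length of the longest series repeated without error, and that the five list scores are summed. The function and constant names are invented for the illustration.

```python
# Paired associates: 10 name pairs presented in 6 cycles, one mark per answer.
PAIRED_MAX = 10 * 6

# Cumulative rote learning: 5 lists of 15 names each.
ROTE_MAX = 5 * 15


def list_score(series_correct):
    """Length of the longest series repeated without error on one list.

    series_correct[i] is True when the series of i + 1 items was correct.
    Testing on a list stops at the first error (assumed reading).
    """
    score = 0
    for ok in series_correct:
        if not ok:
            break
        score += 1
    return score


def rote_total(lists):
    """Grand total over all lists (assumed to be the sum of list scores)."""
    return sum(list_score(series) for series in lists)
```

Under this reading a patient who repeats every series on every list correctly obtains the stated maximum of 75.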

At the end of the first session each pa-
tient was asked whether he had made use of
mnemonics, visual images, or other devices in 
learning. 


Relationship of Tests 


The paired-associate and rote-memory tests 
at the first session had a product-moment cor- 
relation of .32. The parallel versions of paired 
associates correlated .72 and those of rote 
memory .78. The split-half correlation of the
first paired-associate test (with Spearman- 
Brown correction) was .95 for odd and even 
cycles, and .80 for items which were odd and 
even in the order of the first cycle. The first 
and third lists of the rote-memory test cor- 
related with the second and fourth .77. In 
both paired-associate tests the items differed 
very significantly in their degree of difficulty 
but the factors governing this were complex. 
There was no significant relation to the fre- 
quency of the personal names in the commu- 
nity. The item presented first in the first cy- 
cle had an advantage in each test. Difficulty 
of the rote lists varied according to knowledge 
and interest. 
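
For readers unfamiliar with the Spearman-Brown correction mentioned above, a minimal sketch follows. The formula is the standard one for a test lengthened k times; the function names are chosen for this illustration, and the half-test correlation is not reported in the text, so it is recovered here only by inverting the formula.

```python
def spearman_brown(r_half, k=2.0):
    """Reliability of a test lengthened k times, given the part correlation."""
    return k * r_half / (1 + (k - 1) * r_half)


def uncorrected_half(r_full):
    """Invert the correction for k = 2 to recover the half-test correlation."""
    return r_full / (2 - r_full)


# Implied uncorrected split-half correlation behind the reported .95.
implied_half = uncorrected_half(0.95)
```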

Observed and stated test anxiety showed a 
fairly close correspondence. Their uncorrected 
phi coefficient was .75. Complaints of mem- 
ory difficulties showed little correspondence 
with anxiety. The phi coefficient with ob- 
served anxiety was .02 and with stated anx- 
iety .12. 
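
The phi coefficient used above is the product-moment correlation for a fourfold table. The sketch below gives the standard formula; the cell layout and the function name are assumptions made for illustration, and no cell counts from the study are implied.

```python
from math import sqrt


def phi_coefficient(a, b, c, d):
    """Phi for a 2 x 2 table laid out as [[a, b], [c, d]].

    For example, a = anxious on both ratings, d = anxious on neither,
    b and c = the two kinds of disagreement (hypothetical layout).
    """
    denominator = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denominator
```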


Sex, Age, and Vocabulary in Undamaged 
Patients 


The relative importance of sex, age, and 
vocabulary level in the performance of the 
undamaged patients on the first versions of 
the paired-associate and cumulative rote tests 
and also in regard to paramnesias was deter- 
mined by analysis of variance. In paired-as- 
sociate learning sex differences were not sig- 
nificant, age differences were significant at 
the 5% level, and differences according to 
vocabulary significant at the 1% level. The 
old had lower scores than both young and 
middle-aged, but the latter did not differ. The 
low vocabulary patients had markedly lower 
scores than the high vocabulary ones. In 
cumulative rote learning sex was not signifi- 
cant, age was significant at the 5% level and 
vocabulary significant at the 1% level. Here
the middle-aged and old had lower scores than
the young but did not differ from each other. 
The difference for vocabulary level was as 
marked as in paired-associate learning. In 
paramnesias sex was significant at the 5% 
level, age was not significant, and vocabulary 
was significant at the 5% level. They were 
commoner in males and the low vocabulary 
patients. 


Anxiety and Other Factors in 
Undamaged Patients 


The significance of differences in the occur- 
rence of anxiety, memory difficulties, and in- 
terpolated aids was determined by chi square 
or Fisher’s exact method. The effect of these 
factors on performance was assessed by t
tests. Observed anxiety was commoner in
young than old to an extent approaching 
significance. Stated anxiety was significantly 
commoner in young than old at the 1% level 
and in young than middle-aged to an extent 
approaching significance. When comparison 
was made within the separate age groups, 
neither observed nor stated anxiety had any 
significant relationship to efficiency of per- 
formance in paired-associate or rote learning 
nor to the occurrence of paramnesias. Com- 
plaints of memory difficulties had no signifi- 
cant relation to sex, age, or vocabulary nor to 
efficiency of performance in the tests. 
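
Fisher's exact method, named above as the alternative to chi square, can be sketched for a fourfold table as follows. The two-sided rule used here (summing all tables no more probable than the observed one) is one common convention and is an assumption, since the text does not say which form was applied; the function name is invented for the illustration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact probability for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x cases in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
```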

Certain patients showed striking differences 
in score on the two versions of paired-asso- 
ciate learning. To a much smaller extent this 
also occurred in rote learning. Comparison 
was made between those whose change was 
more than one SD of the distribution of 
changes and those where it was not, in re- 
gard to sex, age, vocabulary, anxiety, and 
memory difficulties. In paired associates the 
only significant difference was in regard to 
age. Fluctuations of score were commoner in 
the middle-aged than either young or old, 
which does not seem meaningful. In rote 
memory there were no significant differences. 

Systematic use of interpolated aids was 
made by seven patients, partial use by 32 pa- 
tients, and the remainder relied on direct 
recollection. The use of aids had no signifi- 
cant relation to sex, age, or vocabulary, and 
did not significantly improve efficiency. 




Effects of Brain Damage 


The tests were administered to the brain 
damaged on one occasion only. Comparisons 
were made by t tests within the brain-dam-
aged group and between it and the undam-
aged one. The young brain-damaged patients
were ignored in age comparisons but included 
in the others. 

The middle-aged and old brain-damaged 
patients did not differ significantly on paired 
associates but the middle-aged were signifi- 
cantly better at the 5% level on rote mem- 
ory. The high vocabulary brain-damaged pa- 
tients were significantly better at the 1% 
level than the low vocabulary ones on both 
paired-associate and rote learning. As in the 
undamaged patients paramnesias were signifi- 
cantly commoner among males than females, 
but showed no difference according to age or 
vocabulary. The mildly demented were sig- 
nificantly better at the 5% level than the 
severely demented on rote memory but did 
not differ significantly on paired associates or 
paramnesias. There were no significant dif- 
ferences within the brain-damaged group ac- 
cording to observed or stated anxiety or 
memory difficulties. 

The brain damaged were compared with the 
undamaged in the four subclasses combining 
middle or old age with high or low vocabu- 
lary. They were significantly poorer in each 
subclass on paired associates. They were also 
significantly poorer on rote learning among 
the old low vocabulary patients and almost 
so among the old high vocabulary patients 
but did not differ among the middle-aged. 
The brain-damaged did not differ significantly 
from the undamaged in regard to paramnesias 
or observed and stated test anxiety. Signifi- 
cantly fewer brain-damaged patients at the 
1% level complained of memory difficulties. 
Certain brain-damaged patients gave a rela- 
tively good learning performance, i.e., were 
one SD or more above the mean of the un- 
damaged patients in the same age and vo- 
cabulary subclass. This happened more often 
with rote memory than paired associates. In 
the former, but not the latter, it happened 
more often in the mildly demented. There 
was no relation apparent in this to neuro- 
psychiatric diagnosis. 







Discussion 


Complaints of memory difficulties in them- 
selves would appear to be an unsatisfactory 
pointer to brain damage since the undam- 
aged more frequently make them. In verbal 
learning of the paired associate and cumula- 
tive rote type the efficiency of both undam- 
aged and damaged patients corresponds most 
closely to vocabulary level. Age is also a fac- 
tor of some importance, but anxiety in the 
test situation appears to be a negligible in- 
fluence. In testing for verbal memory defects, 
therefore, it would appear necessary and suffi- 
cient to allow for age and verbal ability. The 
marked fluctuation of efficiency from one test- 
ing to another in undamaged patients on 
paired-associate learning suggests that tests 
should be applied more than once before 
brain damage is inferred. There can be little 
doubt that, when age and vocabulary level 
are allowed for, brain-damaged patients are 
inferior to other patients on paired-associate 
learning. The existence of a few exceptions 
invites further enquiry. The position in cumu- 
lative rote learning is less clear but it would 
seem that brain damage lowers scores only 
when it exacerbates the effects of advanced 
age. The results on anxiety are discordant 
with a considerable body of work on the re- 
lation between manifest anxiety and learning 
in students (2, 3). This may depend either 
on the method of assessing anxiety or on the
population investigated. The method of as-
sessing anxiety used here, however, seems to
reflect closely the realities of the immediate 
situation in which memory is tested. 


Summary 


Factors influencing efficiency in paired-as- 
sociate and rote verbal learning were investi- 
gated in relation to 59 brain-damaged and 
120 other neuropsychiatric patients. Vocabu- 
lary level and age were significant influences, 
but test anxiety was negligible. When vocabu- 
lary and age were allowed for, the brain 
damaged were significantly less efficient than 
the undamaged on paired-associate learning, 
but the position on rote learning was less 
clear. Some undamaged patients showed strik- 
ing fluctuations in paired-associate learning, 
when tested a second time. 


Received June 8, 1956.


References 


1. Morrow, R. S., & Cohen, J. The diagnostic mem- 
ory scale. 1. Comparison of brain-damaged 
patients and normal controls. Trans. N. Y. 
Acad. Sci., Ser. II, 1952, 14, 241-246.

2. Lazarus, R. S., Deese, J., & Osler, Sonia F. The 
effects of psychological stress upon perform- 
ance. Psychol. Bull., 1952, 49, 293-317. 

3. Taylor, Janet A., & Chapman, Jean P. Anxiety
and the learning of paired-associates. Amer. 
J. Psychol., 1955, 68, 671. 













The Stability of the Social Desirability Scale Values 
in the Edwards Personal Preference Schedule¹


C. James Klett²


University of Washington 


The Edwards Personal Preference Schedule 
(EPPS) (5) makes use of a forced-choice 
technique in which items pertaining to differ- 
ing psychological needs but having compa- 
rable social desirability scale values are paired 
in an attempt to minimize the subjects’ natu- 
ral tendency to respond in the socially ap- 
proved direction. In developing the EPPS, 
Edwards (4) scaled 140 items relating to 
14 relatively independent, normal personality 
variables³ drawn from a list of manifest
needs described by Murray (9). 

In the course of establishing high school 
norms for the EPPS (7, 8), the question was 
raised as to whether the pairs of items in the 
EPPS were equally well matched for social 
desirability when presented to groups other 
than the college population upon which the 
test was developed and standardized. Accord- 
ingly, a group of 206 high school students 
from Lincoln High School in Tacoma, Wash- 
ington was selected, and the items were re- 
scaled for social desirability. Six items relat- 
ing to heterosexuality were omitted as being 
unsuitable for verbal administration to large 
groups of high school students, and the re- 
maining 134 items were rescaled in the same 
manner as that utilized by Edwards (4). 

1 This study was part of a doctoral dissertation
completed at the University of Washington, 1956.
Thanks are extended to Dr. Allen L. Edwards and
to the public school officials of the Tacoma Public
School System for their valuable assistance.

2 Now at VA Hospital, Northampton, Massachu-
setts.

3 The fifteenth need (abasement) was not scaled
in the same manner, the scale values being esti-
mated by means of the regression of probability of
endorsement on social desirability scale values (5).

Prior to the scaling, the judges were asked
to record their age, grade, sex, and a descrip-
tion of their fathers’ occupation on a special
blank. From the description of the fathers’ 
occupation, a socioeconomic status (SES) 
classification was made for each judge, based 
upon the SES occupational tables developed 
by the U. S. Bureau of the Census (2), which 
provide for six categories ranging from pro- 
fessional to unskilled worker. This classifica- 
tion was made by two psychologists working 
independently, and disagreements were dis- 
cussed and reconciled. 

Analyses of the judgments about the social 
desirability of each item were made sepa-
rately by sex, grade, and SES. The median
interval values of the items for the group of
boys were plotted against the comparable
values for the girls, and similar plots were
made by grade and by SES. It was found that
neither sex, grade, nor SES produced any 
essential differences in the intervals in which 
the median judgments for the items were 
found. The separate distributions were then 
combined to form a single distribution of 
judgments from which the social desirability 
scale values of the items were obtained by 
means of the method of successive intervals 
(3). These scale values correlated .94 with 
those derived by Edwards (4). 

In order to study the goodness of fit of the 
items which were paired in the EPPS for so- 
cial desirability, a comparison of the scale 
value for each item over all pairs was made 
by means of Fisher’s intraclass correlation 
and found to be .69. Edwards reported a simi- 
lar correlation of .85. Secondly, a comparison 
was made of the distribution of the absolute 
differences between the scale values for item 
pairs with the corresponding distribution ob-
tained by Edwards. A test of the differences
between the means of these two distributions
yielded a t of 3.51, significant beyond the 1%
level. The mean of the absolute differences de- 
rived from the new scale values was .452, 
while that derived from Edwards’ scale values 
was .314. 
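
The two goodness-of-fit statistics described above can be sketched as follows. The intraclass formula is the classical one for paired observations, in which both members of a pair are measured about a single pooled mean; the function names are invented for this illustration and the study's scale values are not reproduced.

```python
def intraclass_pairs(pairs):
    """Intraclass correlation for a list of (x, y) scale-value pairs."""
    n = len(pairs)
    pooled_mean = sum(x + y for x, y in pairs) / (2 * n)
    pooled_var = sum(
        (x - pooled_mean) ** 2 + (y - pooled_mean) ** 2 for x, y in pairs
    ) / (2 * n)
    cross = sum((x - pooled_mean) * (y - pooled_mean) for x, y in pairs)
    return cross / (n * pooled_var)


def mean_absolute_difference(pairs):
    """Mean of |x - y| over all item pairs."""
    return sum(abs(x - y) for x, y in pairs) / len(pairs)
```

Perfectly matched pairs give an intraclass correlation of 1 and a mean absolute difference of 0, so a lower correlation together with a larger mean difference indicates poorer matching.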

Finally, to ascertain the relationship of the 
new scale values and actual test performance, 
100 EPPS protocols were drawn at random 
from the high school normative sample (7, 
8), and the proportion of endorsements of 
the first item in each pair was computed. The 
correlation between the proportion of endorse- 
ment and the difference in scale values for the 
items in each pair was found to be .51. 
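
The computation just described can be sketched in two steps: the proportion of subjects endorsing the first item of each pair, and the product-moment correlation of those proportions with the scale-value differences. The data layout and function names are assumptions made for illustration.

```python
from math import sqrt


def endorsement_proportions(protocols):
    """Proportion of subjects choosing the first item, pair by pair.

    protocols is a list with one entry per subject; each entry is a list
    of booleans, True when the first item of that pair was endorsed.
    """
    n_subjects = len(protocols)
    return [sum(column) / n_subjects for column in zip(*protocols)]


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)
```

With the 100 protocols and the per-pair differences in scale values as inputs, `pearson_r(endorsement_proportions(protocols), differences)` would correspond to the coefficient reported above.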


Discussion 


In the course of deriving his scale values 
for social desirability, Edwards (4) found no 
essential relationships between the interval in 
which the median value of the social desir- 
ability judgments fell and sex, age, or edu- 
cation. The present study shows no very cru- 
cial differences between the sexes or grades. 
Such differences between groups as were found 
seemed to be no greater or more frequent 
than could be expected by chance when study- 
ing as many as 134 pairs of median values. 
In plotting the data for the sexes, for exam- 
ple, only four pairs of median values differed 
by two intervals from the principal diagonal. 
An inspection of the raw data revealed that 
the median values fell near the limits of the 
interval in most cases, so the apparent dif- 
ferences in median judgments would be re- 
duced to somewhat less than an interval of 
two. 

A more unexpected finding was the lack of 
relationship between the median interval value 
as judged by differing socioeconomic groups. 
It would appear, from these findings, that 
subjects tended to judge social desirability in 
others on the basis of a common stereotype 
unrelated to grade, sex, or differential social 
class membership. 

The correlation (.94) of the scale values 
with those of the college group (5) demon-
strated a remarkable stability in their rela-
tive size. Fujita (6) obtained a comparable
correlation with Edwards’ scale values using
a group of college Nisei as judges, and
Lovaas,⁴ administering a translated version
of the items to a group of students in Nor- 
way, obtained a correlation of .78 between 
his scale values and those of Edwards. The 
notion that such a pervasive and stable social 
desirability stereotype exists is a provocative 
one and has possible implications for social 
psychology as well as test theory. What con- 
stitutes socially approved behavior and the 
readiness with which individuals accept the 
stereotype may be utilized as dimensions in 
studying cultural differences. Whether the 
social desirability stereotype is sensitive to 
psychopathology is another avenue of applica- 
tion. The finding that the judged social de- 
sirability of an item did not vary among the 
subgroups mentioned would seem to minimize 
the contribution of this particular variable to 
personality test score differences, either be- 
tween socioeconomic groups or between vary- 
ing samples of the population. This has some 
relevance for test theory, because, if the con- 
cept of social desirability varied markedly, it 
would be difficult, if not impossible, in stud- 
ies such as reviewed by Auld (1), to deter- 
mine whether personality variables or social 
desirability variables account for the observed 
test differences. 

Since the format of the EPPS is constant, 
i.e., the items which appear in the pairs have 
been set, differences in social desirability 
scale values could be expected to change the 
probability of endorsement of either item in 
a particular pair. The goodness of fit of the 
paired items in terms of social desirability 
was evaluated by means of two statistics. The 
intraclass correlation between the scale values
of item pairs and the t test between the
means of the absolute scale value differences
within item pairs both indicated that Ed-
wards’ scale values were more successful in
matching the items in pairs than the present
scale values. As Edwards was able to ma- 
nipulate freely the items forming the pairs 
until he obtained the best possible fit, such 
a finding is not surprising. It was expected 
that, with different scale values and predeter- 
mined pairs, the goodness of fit would only 
approximate that of the original fit. An im-
plication which this has for the forced-choice
format, however, is that, even with a correla-
tion coefficient of .94 between the new and
the original scale values, the goodness of fit
of paired items can be significantly altered.

4 Ivar Lovaas: personal communication, June 15,
1956.

A test of the effect that goodness of fit has 
on forced-choice performance was provided 
by a comparison of the proportion of en- 
dorsement of item A of each pair with the 
difference in scale values of the pair. Ed- 
wards (5) was able to reduce the correlation 
between proportion of response and social de- 
sirability scale values of the item from .87 in 
a true-false format to .40 when the items were 
paired for social desirability. The new scale 
values correlated .51. This coefficient is not 
so low as that reported by Edwards, although 
the difference between the two correlations is 
not significant. 
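
A comparison of two correlations such as .51 and .40 is commonly made with Fisher's z transformation, sketched below. The text does not say which test was used or how many item pairs entered each coefficient, so the value of 210 pairs is purely hypothetical, and the test assumes the two coefficients come from independent samples.

```python
from math import atanh, sqrt


def fisher_z_difference(r1, n1, r2, n2):
    """Normal deviate for the difference between two independent r values."""
    standard_error = sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (atanh(r1) - atanh(r2)) / standard_error


# Hypothetical number of item pairs behind each coefficient.
HYPOTHETICAL_N = 210
z = fisher_z_difference(0.51, HYPOTHETICAL_N, 0.40, HYPOTHETICAL_N)
```

Under these assumed sample sizes the deviate falls short of the 1.96 required at the 5% level, which is in line with the statement that the difference is not significant.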


Summary and Conclusions 


The Edwards Personal Preference Schedule 
(EPPS) (5) is so constructed as to pair items 
representing different psychological needs in 
terms of their social desirability scale values. 
Since the determination of the original scale 
values had been made on a college group, the 
present study was designed to determine how 
similar scale values obtained from a high 
school group, and from varying socioeconomic 
status groups within the high school popula- 
tion, would be to the original scale values ob- 
tained by Edwards (3). Further, an effort 
was made to determine what effect differences 
in scale values would have on the adequacy 
of matching of the pairs of items in the EPPS 
and on the probability of endorsement of the 
items. It was found that: 


1. There were no differences among socio- 
economic groups within the high school popu- 
lation as to the median value of their social 
desirability judgments on the items. There 
was no difference between the grades or sexes. 




2. The social desirability scale values ob- 
tained from the high school group as a whole 
correlated .94 with those obtained by Ed- 
wards (3). 

3. Despite the high correlation between 
the two sets of scale values, tests of the good- 
ness of fit of the matched pairs in the EPPS 
revealed that the scale values as obtained in 
this study resulted in less adequate matching 
and a correspondingly greater relationship be-
tween social desirability and probability of
endorsement. 


Received June 25, 1956. 


References 


1. Auld, F. Influence of social class on personality 
test responses. Psychol. Bull., 1952, 49, 318-
333.

2. Edwards, Alba M. Comparative occupation sta- 
tistics for the United States, 1870 to 1940. 
Sixteenth Census of the United States: 1940. 
Washington: U. S. Govt. Printing Office, 1943. 

3. Edwards, A. L. The scaling of stimuli by the 
method of successive intervals. J. appl. Psy- 
chol., 1952, 36, 118-122. 

4. Edwards, A. L. The relationship between the 
judged desirability of a trait and the prob- 
ability that the trait will be endorsed. J. appl. 
Psychol., 1953, 37, 90-93. 

5. Edwards, A. L. Edwards Personal Preference 
Schedule. Manual. New York: Psychological 
Corp., 1954. 

6. Fujita, B. An investigation of the applicability of 
the Edwards Personal Preference Schedule to 
a cultural subgroup. Unpublished master’s 
thesis, Univer. of Washington, Seattle, 1955. 

7. Klett, C. J. A study of the Edwards Personal 
Preference Schedule in relation to socio-eco- 
nomic status. Unpublished doctor’s disserta- 
tion, Univer. of Washington, Seattle, 1956. 

8. Klett, C. J. Performance of high school students 
on the Edwards Personal Preference Schedule. 
J. consult. Psychol., 1957, 21, 68-72. 

9. Murray, H. A., et al. Explorations in personality. 
New York: Oxford Univer. Press, 1938. 










Overinclusive Thinking in a Depressive 
and a Control Group 


R. W. Payne 


Institute of Psychiatry, University of London, Maudsley Hospital 


and Heather L. Hirst 


University of London 


Cameron (1, 2, 3, 4, 5) believes that “over- 
inclusion” is one of the most important as- 
pects of schizophrenic thought disorder. The 
schizophrenic is unable to preserve his con- 
ceptual boundaries, so that irrelevant ideas 
become incorporated into his concepts, mak- 
ing his thinking more abstract and less lucid. 

Epstein (6) has contributed enormously to 
the operational definition of “overinclusion” 
by developing a simple pencil-and-paper
measure of this aspect of thought disorder. 
The Epstein test presents the subject with 
a list of 50 words. Following each stimulus 
word there are six response words (including 
the word “none”). The subject is merely 
asked to underline all those response words 
which are a necessary part of the concept de- 
noted by the stimulus word. Epstein predicted 
from Cameron’s theory that schizophrenics 
would underline more response words than 
normals, as their concepts would be more 
overinclusive. This was in fact the case, the 
difference being significant at the .001 level 
of confidence. He also found no differences 
between normals and schizophrenics in “un- 
derinclusion” (the tendency to underline too 
few response words). As Epstein points out, 
however, “Another question that arises is 
whether overinclusion is characteristic only 
of schizophrenic behaviour, or whether it is 
equally prominent in other psychoses” (6). 

It was the purpose of the present study to 
investigate “overinclusion” in depressives. If, 
as Cameron suggests, this form of thought 
disorder is typical of schizophrenics, depres- 
sives should not differ much from normals. 


Procedure 


Tests. The Epstein Overinclusion test and 
the Mill Hill Vocabulary test were given to 
a group of 11 depressives and 14 normal con- 
trols. The Epstein test was given and scored 
according to Epstein’s procedure.¹ An “overinclusion 
score” and an “underinclusion score” 
were then obtained. The Mill Hill Vocabu- 
lary test was also given in the standard way 
(9). Both tests were administered individu- 
ally, and the order of presentation was alter- 
nated. The vocabulary test was included so as 
to match the subjects for vocabulary level. 

Subjects. The patients consisted of 11 de- 
pressives, four males and seven females be- 
tween the ages of 33 and 56. All were inpa- 
tients of the Bethlem Royal or the Maudsley 
Hospital. All had been diagnosed depressive 
by both the registrar and the consultant in 
charge of the case. All were regarded as rea- 
sonably typical cases of “endogenous” de- 
pression. 

The controls consisted of 14 normal peo- 
ple, chosen so that as a group they were 
closely matched with the patients for age, 
sex, occupation, educational level, and vo- 
cabulary level. There were five males and 
nine females, between the ages of 28 and 56. 
These data are summarized in Table 1. As 
can be seen, the controls do not differ signifi- 
cantly from the depressives in age or vo- 
cabulary score. 


1 The authors would like to thank Professor Ep- 
stein for making available to the psychology depart- 
ment of the Institute of Psychiatry his test and scor- 
ing key. 




Results 


The results are presented in Table 2. As 
can be seen, the depressives “overinclude” to 
a significantly greater extent than do the 
normal controls (t = 3.35, p < 0.001), contrary 
to what one might expect from Cameron’s 
writings. In fact, the depressives in this 
sample have a much higher mean “overinclusion” 
score (37.91) than did the schizophrenics in 
Epstein’s study (20.92). This difference is 
probably significant, although its significance 
cannot be tested, as Epstein does not give the 
variance for his schizophrenic group. 

It is interesting that the control group in 
the present study obtained a mean “overinclusion” 
score (13.79) almost identical with that of 
Epstein’s normals (12.49). As the present 
control group is English, it is surprising that 
cultural and other factors seem to make so 
little difference. 

There were no significant differences with 
respect to the underinclusion score. This is 
consistent with Epstein’s finding. 
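The t values reported in Table 2 can be checked from the group means and standard deviations. The sketch below is only an illustration: it assumes the group sizes given under Procedure (11 depressives, 14 controls) and an unpooled standard error, sqrt(s1²/n1 + s2²/n2). The paper does not say which formula was used, so that choice is an assumption, and the function name is hypothetical.

```python
import math


def t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """t statistic for a difference in means, using an unpooled standard error.

    Assumption: the paper does not state its formula; this is one common choice.
    """
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return abs(mean1 - mean2) / standard_error


# Group sizes as stated under Procedure.
N_DEPRESSIVES, N_CONTROLS = 11, 14

# (depressive mean, depressive SD, control mean, control SD), copied from Table 2.
TABLE_2 = {
    "overinclusion": (37.91, 22.87, 13.79, 7.66),
    "underinclusion": (12.64, 5.02, 11.14, 4.50),
    "neologism": (3.27, 2.37, 1.93, 2.12),
}

for measure, (m_dep, sd_dep, m_ctrl, sd_ctrl) in TABLE_2.items():
    t = t_from_summary(m_dep, sd_dep, N_DEPRESSIVES, m_ctrl, sd_ctrl, N_CONTROLS)
    print(f"{measure}: t = {t:.2f}")
```

Under these assumptions the computed values fall within about 0.01 of the three t values in Table 2. Small discrepancies are to be expected, because the tabled means and standard deviations are themselves rounded.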

Table 1 

Characteristics of the Depressive and Normal Samples 

Measure                         Depressives   Normal controls   Significance of difference 
Education: 
  elementary                         8              11 
  secondary                          3               3 
Occupation: 
  semiskilled                        5               6 
  skilled                            5               6 
  managerial or professional         1               2 
Age: 
  Mean                             43.82           44.64        t = 0.23 
  SD                                8.51            9.36        p > 0.5 
Mill Hill Vocabulary, raw score: 
  Mean                             52.91           57.21        t = 1.14 
  SD                                9.94            8.58        0.50 > p > 0.10 


Table 2 

Overinclusion, Underinclusion, and Neologisms in 
Depressive and Normal Subjects 

Measure                         Depressives   Normal controls   Significance of difference 
Overinclusion 
  Mean                             37.91           13.79        t = 3.35 
  SD                               22.87            7.66        p < 0.001 
  Range                          14 to 92         4 to 30 
Underinclusion 
  Mean                             12.64           11.14        t = 0.77 
  SD                                5.02            4.50        0.50 > p > 0.10 
  Range                           5 to 21         6 to 19 
Neologism 
  Mean                              3.27            1.93        t = 1.47 
  SD                                2.37            2.12        0.50 > p > 0.10 
  Range                           0 to 6          0 to 8 


The present study also considered an additional 
score, the “neologism” score. Five of the 50 
stimulus words are neologisms, as are five of 
the response words. The neologism score was 
merely the total number of responses (other 
than “none”) underlined in response to a 
neologism, plus the number of neologisms 
underlined as responses. The depressives had 
slightly higher scores, but the difference did 
not reach significance. This suggests that the 
tendency to respond to neologisms cannot 
explain the differences in “overinclusion.” 
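The neologism score described above is a simple count, and can be sketched as follows. Everything in the example is hypothetical: the function name, the data layout (a mapping from each stimulus word to the set of response words the subject underlined), and the made-up words. The sketch also assumes that a neologism response underlined under a neologism stimulus contributes to both parts of the sum; the text does not say how such a case was scored.

```python
from typing import Dict, Set


def neologism_score(
    underlined: Dict[str, Set[str]],
    neologism_stimuli: Set[str],
    neologism_responses: Set[str],
) -> int:
    """Responses underlined to neologism stimuli, plus neologisms underlined as responses."""
    score = 0
    for stimulus, chosen in underlined.items():
        chosen = chosen - {"none"}  # "none" never counts toward the score
        if stimulus in neologism_stimuli:
            # Part 1: every response underlined in reply to a neologism stimulus.
            score += len(chosen)
        # Part 2: neologism response words underlined, whatever the stimulus.
        score += len(chosen & neologism_responses)
    return score


# Made-up answer sheet: "glimp" and "zarf" stand in for the test's neologisms.
example_sheet = {
    "glimp": {"soft", "round"},  # two responses to a neologism stimulus
    "house": {"roof", "zarf"},   # one neologism underlined as a response
    "tree": {"none"},
}
print(neologism_score(example_sheet, {"glimp"}, {"zarf"}))
```

With Epstein's actual key, the two sets of neologisms would hold the five neologism stimulus words and the five neologism response words mentioned in the text.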


Summary and Conclusions 


The present results suggest that depressives 
“overinclude” significantly more than normals 
on Epstein’s test. In fact depressives are prob- 
ably more abnormal with respect to “overin- 
clusion” of thinking than are schizophrenics. 
This is inconsistent with Cameron’s theory, 
as he appears to regard this type of thought 
disorder as specific to schizophrenics. Two ex- 
planations could account for these results: 

1. It is possible that “overinclusion” is 
related to “psychoticism” rather than to 
schizophrenia specifically. It is also possible that 
the depressives in the present study were 
more “psychotic” than the schizophrenics in 
Epstein’s study. Similar results have been re- 
ported by Eysenck (7, 8) if “psychoticism” 
is defined operationally in terms of a “factor.” 

2. It is possible on the other hand that 
“overinclusion” is merely related to the spe- 
cific symptom of depression. Schizophrenics 
as a group are probably more depressed than 
are normals, but not as depressed as depres- 
sive patients. 


Received June 25, 1956. 







References 

1. Cameron, N. Reasoning, regression and communication in schizophrenics. Psychol. Monogr., 1938, 50, No. 1 (Whole No. 221). 

2. Cameron, N. A study of thinking in senile deterioration and schizophrenic disorganization. Amer. J. Psychol., 1938, 51, 650-664. 

3. Cameron, N. Deterioration and regression in schizophrenic thinking. J. abnorm. soc. Psychol., 1939, 34, 265-270. 

4. Cameron, N. Schizophrenic thinking in a problem-solving situation. J. ment. Sci., 1939, 85, 1012-1035. 

5. Cameron, N. The functional psychoses. In J. McV. Hunt (Ed.), Personality and the behavior disorders. New York: Ronald Press, 1944. Pp. 861-921. 

6. Epstein, S. Overinclusive thinking in a schizophrenic and a control group. J. consult. Psychol., 1953, 17, 384-388. 

7. Eysenck, H. J. Schizothymia-cyclothymia as a dimension of personality: II. Experimental. J. Pers., 1952, 20, 346-384. 

8. Eysenck, H. J. The scientific study of personality. London: Routledge & Kegan Paul, Ltd., 1952. 

9. Raven, J. C. Guide to using the Mill Hill Vocabulary Scale with Progressive Matrices (1938). London: Lewis, 1950. 



























































